Abstract

It is not hard to write a first order formula which is true for a given graph $G$ but is false
for any graph not isomorphic to $G$. The smallest number $D(G)$ of nested quantifiers in
such a formula can serve as a measure for the "first order complexity" of $G$.

Here, this parameter is studied for random graphs. We determine it asymptotically
when the edge probability $p$ is constant; in fact, $D(G)$ is of order $\log n$ then. For very sparse
graphs its magnitude is $\Theta(n)$. On the other hand, for certain (carefully chosen) values of $p$
the parameter $D(G)$ can drop down to the very slow growing function $\log^* n$, the inverse of
the TOWER-function. The general picture, however, is still a mystery.



1 Introduction 

In this paper we shall deal with sentences about graphs expressible in first order logic. Namely, 
the vocabulary consists of the following symbols: 

• variables ($x$, $y$, $y_i$, etc.);

• the relations $=$ (equality) and $\sim$ (the graph adjacency);

• the quantifiers $\forall$ (universality) and $\exists$ (existence);

• the usual Boolean connectives ($\vee$, $\wedge$, $\Leftrightarrow$, and $\Rightarrow$).

These can be combined into first order formulas according to the standard rules. A sentence
is a formula without free variables. On the intuitive level it is perfectly clear what we mean
when we say that a sentence $A$ is true on a graph $G$. This is denoted by $G \models A$; we write
$G \not\models A$ for its negation ($A$ is false on $G$). We do not formalize these notions; a more
detailed discussion can be found in e.g. [20, Section 1].

Please note that the variables represent vertices so the quantifiers apply to vertices only, i.e. we 
cannot express sentences like "There is a set X having a given property" . (In fact, the language 
lacks any symbols to represent sets or functions.) We do not allow infinite sentences. As we do 
not go beyond first order logic, the standalone term "sentence" means a "first order sentence." 

From a logician's point of view, the first order properties of graphs form a natural class of 
properties to study. For example, the so-called zero-one laws for random graphs have been 
extensively studied: see e.g. [10, 8, 12, 13, 6, 19, 14, 18, 20]. 

Of course, if $G \models A$ and $H \cong G$ (i.e. $H$ is isomorphic to $G$), then $H \models A$. On the other hand,
for any graph $G$ it is possible to find a first order sentence $A$ which defines $G$, that is, $G \models A$
while $H \not\models A$ for any $H \not\cong G$. Indeed, let $V(G) = \{v_1, \dots, v_n\}$. The required sentence $A$ could
read:

"There are vertices $x_1, \dots, x_n$, all distinct, such that any vertex $x_{n+1}$ is equal to one
of these and $x_i \sim x_j$ iff $\{v_i, v_j\} \in E(G)$, $1 \le i < j \le n$."

However, this sentence looks rather wasteful: we have $n+1$ variables, the $\sim$-relation was used
$\binom{n}{2}$ times, etc. Of a number of possible parameters measuring how complex $A$ is, we choose
here $D(A)$, the quantifier depth (or simply depth), which is the size of a longest sequence of
embedded quantifiers. (In the above example, $D(A) = n+1$.) This is a natural characteristic
which appears, for example, in the analysis of algorithms for checking whether $G \models A$. Also,
the depth function can be studied by using the so-called Ehrenfeucht game [7] (see Section 2 
here). Following Pikhurko, Veith and Verbitsky [16] (see also [17]) we let D(G) be the smallest 
depth of a sentence defining G. It is a measure of how difficult it is to describe the graph G in 
first order logic. 
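
As an illustration of the definition of depth, the following minimal Python sketch builds the wasteful defining sentence described above and computes its quantifier depth. The tuple encoding of formulas and the names `quantifier_depth` and `naive_defining_sentence` are assumptions made for this example only; they do not come from the paper.

```python
from itertools import combinations

# Hypothetical encoding of first order formulas as nested tuples:
#   ("adj", "x", "y")  means x ~ y,   ("eq", "x", "y")  means x = y,
#   ("not", f), ("and", f, g), ("or", f, g),
#   ("forall", "x", f), ("exists", "x", f).
ATOMS = {"adj", "eq"}
QUANTIFIERS = {"forall", "exists"}


def quantifier_depth(formula):
    """Length of a longest chain of nested quantifiers in the formula."""
    op = formula[0]
    if op in ATOMS:
        return 0
    if op in QUANTIFIERS:
        return 1 + quantifier_depth(formula[2])
    # Boolean connective: the depth is the maximum over the subformulas.
    return max(quantifier_depth(sub) for sub in formula[1:])


def _fold(op, parts):
    """Combine a non-empty sequence of formulas with a binary connective."""
    parts = list(parts)
    out = parts[0]
    for f in parts[1:]:
        out = (op, out, f)
    return out


def naive_defining_sentence(n, edges):
    """The wasteful sentence from the text for a graph on vertices 0..n-1 (n >= 1)."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    edge_set = {frozenset(e) for e in edges}
    clauses = []
    for i, j in combinations(range(n), 2):
        clauses.append(("not", ("eq", xs[i], xs[j])))      # all x_i distinct
        atom = ("adj", xs[i], xs[j])
        clauses.append(atom if frozenset((i, j)) in edge_set else ("not", atom))
    last = f"x{n + 1}"
    # Every vertex x_{n+1} equals one of x_1, ..., x_n.
    clauses.append(("forall", last, _fold("or", (("eq", last, x) for x in xs))))
    body = _fold("and", clauses)
    for x in reversed(xs):
        body = ("exists", x, body)
    return body


triangle = naive_defining_sentence(3, [(0, 1), (1, 2), (0, 2)])
print(quantifier_depth(triangle))
```

For a graph on $n$ vertices the sentence built this way has $n$ existential quantifiers followed by one universal quantifier, in line with $D(A) = n + 1$ above.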

A word of warning: the function $D(G)$ does not correlate very well with our everyday intuition
of how complex the graph $G$ is. Such are the limitations of the first order language that, for
example, $D(K_n) = D(\overline{K_n}) = n+1$ is the largest among all order-$n$ graphs, but what can be
simpler than the complete or empty graph?!

Perhaps, this is just an unlucky example? This approach seems only partially helpful: it is
shown in [16] that, certain exceptions aside, $D(G)$ can be bounded by $\frac{v(G)+5}{2}$; however, the
situation seems to get messy when we try to describe all graphs with $D(G) > (\frac12 - \epsilon)v(G)$.
Besides a large homogeneous set, there are other obstacles which may push the depth up:
Cai, Fürer, and Immerman [5] constructed graphs $G$ whose definition requires $\Omega(v(G))$ nested
quantifiers even if we add counting to first order logic. This is a rather drastic addition: for
example, we need only two nested quantifiers to define $K_n$ with counting, namely,

"There are precisely $n$ vertices and every two of them are connected."

The opposite approach was taken by Pikhurko, Spencer and Verbitsky [15]: what is $g(n)$, the
minimum $D(G)$ over all graphs $G$ of order $n$? It turned out that $g(n)$ can be arbitrarily small
in the following sense: for any recursive function $f : \mathbb{N} \to \mathbb{N}$ there is $n$ such that $f(g(n)) < n$.
If we try to "smoothen" $g(n)$ by defining $\gamma(n) = \max_{i \le n} g(i)$, then $\gamma(n) = \Theta(\log^* n)$. Here, the
log-star $\log^* n$ is the inverse to the TOWER-function, that is, the number of times we have to
take the binary logarithm before we get below one:

$$\log^* n = \min\{i \in \mathbb{N} : \log_2^{(i)} n < 1\}.$$
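
To make the growth rate concrete, here is a small Python sketch of the log-star function exactly as defined above (iterate the binary logarithm until the value drops below one). The function name `log_star` is an assumption of this example.

```python
import math


def log_star(n):
    """Smallest i such that applying log2 to n a total of i times gives a value below 1."""
    count = 0
    x = n
    while x >= 1:
        x = math.log2(x)  # math.log2 also accepts very large Python integers
        count += 1
    return count


for n in (2, 4, 16, 65536, 2 ** 65536):
    print(log_star(n))
```

Under this definition the values at the tower numbers 2, 4, 16, 65536, $2^{65536}$ increase by one at each step, which shows how slowly $\log^* n$ grows.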

Such a behavior is surprising and intriguing. Having studied the two extreme cases, we
concentrate now on what happens in a typical graph. More generally, we consider the standard
Erdős–Rényi model $G \in \mathcal{G}(n,p)$, where $p$ denotes the edge probability. Of course, we are
interested in events occurring whp (with high probability, that is, with probability $1 - o(1)$ as
$n \to \infty$). While a zero-one law studies the probability that a fixed sentence holds, we take a
random graph and ask what the 'simplest' sentence defining it is.

As $D(\overline{G}) = D(G)$, we can assume without loss of generality that $p \le \frac12$.

In Section 3 we study the case when $0 < p \le \frac12$ is a constant and show that whp

$$D(G) = \log_{1/p} n + O(\ln\ln n). \tag{1}$$

The case $p = \frac12$ is always of particular interest: $G \in \mathcal{G}(n, \frac12)$ is uniformly distributed among all
graphs of order $n$. In Section 4, we have found a different line of argument (as far as the upper
bound is concerned), which allowed us to pinpoint $D(G)$ down to at most 5 distinct values for
infinitely many $n$. Unfortunately, this approach does not seem to work for $p < \frac12$.

In Section 5 we show that for p < D(G) is determined by the number of isolated vertices 
and therefore is of order $\Theta(n)$. We believe (cf. Conjecture 20) that the giant component, which
appears around $p = \frac1n$, has negligible effect on $D(G)$ as long as $p = O(n^{-1})$ but we were not
able to prove this.

Rather surprisingly, for some carefully selected $p = p(n)$ the function $D(G)$ can be as small as
$O(\log^* n)$. The reason is that integer arithmetic can be modeled over the obtained random
graphs while integers can be defined by first order sentences of very small depth. We do not
present an exhaustive general theorem but give an example demonstrating this phenomenon
when $p = n^{-1/4}$. On the other hand, the upper bound $O(\log^* n)$ is sharp, up to a multiplicative
constant, cf. Theorem 21.

The first order complexity of $G \in \mathcal{G}(n,p)$ for general $p$ remains a mystery. Open problems
and conjectures are scattered throughout the text. See also Section 7 for some concluding
remarks.



2 The Ehrenfeucht Game 

For non-isomorphic graphs $G$ and $G'$ let $D(G, G')$ be the smallest quantifier depth of a first
order sentence $A$ distinguishing $G$ from $G'$ (that is, $G \models A$ while $G' \not\models A$). As the negation
sign does not affect the depth, we have $D(G, G') = D(G', G)$.

Lemma 1 For any graph $G$ we have

$$D(G) = \max\{D(G, G') : G' \not\cong G\}. \tag{2}$$

Proof. Clearly, $D(G, G') \le v(G) + 1$, so the right-hand side of (2) is well-defined. Theorem 2.2.1
in [20] implies that all graphs can be split into finitely many classes so that no first order
sentence of depth at most $v(G) + 1$ distinguishes graphs within a class. For each class,
except the one which contains $G$, pick a representative $G'$ and let $A_{G'}$ be a minimum depth
sentence distinguishing $G$ from $G'$. The conjunction of these $A_{G'}$ proves the '$\le$'-inequality in (2).

The converse inequality is trivial. ∎

In the remainder of this section, we describe the Ehrenfeucht game, which is a very useful
combinatorial tool for studying $D(G, G')$. It was introduced by Ehrenfeucht [7]. Earlier, Fraïssé [9]
suggested an essentially equivalent way to compute $D(G, G')$ in terms of partial isomorphisms
between $G$ and $G'$. A detailed discussion of the game can be found in [20, Section 2].

Let $G$ and $G'$ be two graphs. By replacing $G'$ with an isomorphic graph, we can assume that
$V(G) \cap V(G') = \emptyset$. The Ehrenfeucht game $\mathrm{Ehr}_k(G, G')$ is played by two players, called Spoiler
and Duplicator, and consists of $k$ rounds. For brevity, let us refer to Spoiler as 'him' and to
Duplicator as 'her'. In the $i$-th round, $i = 1, \dots, k$, Spoiler selects one of the graphs $G$ and $G'$
and marks one of its vertices by $i$; Duplicator must put the same label $i$ on a vertex in the other
graph. (A vertex may receive more than one mark.) At the end of the game (i.e. after $k$ rounds)
let $x_1, \dots, x_k$ be the vertices of $G$ marked $1, \dots, k$ respectively, regardless of who put the label
there; let $x'_1, \dots, x'_k$ be the corresponding vertices in $G'$. Duplicator wins if the correspondence
$x_i \leftrightarrow x'_i$ is a partial isomorphism, that is, we require that $\{x_i, x_j\} \in E(G)$ iff $\{x'_i, x'_j\} \in E(G')$
as well as that $x_i = x_j$ iff $x'_i = x'_j$. Otherwise, Spoiler wins.

The crucial relation is that for any non-isomorphic $G$ and $G'$ the smallest $r$ such that Spoiler has
a winning strategy in $\mathrm{Ehr}_r(G, G')$ is equal to $D(G, G')$. In fact, an explicit winning strategy
for Spoiler gives us an explicit sentence distinguishing $G$ from $G'$.
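
For very small graphs the game can be solved by exhaustive search, which gives a direct (exponential-time) way to evaluate $D(G, G')$ through the relation above. The following Python sketch is only an illustration under the assumption that graphs are given as dictionaries mapping each vertex to the set of its neighbours; the function names are hypothetical.

```python
def is_partial_isomorphism(G, H, xs, ys):
    """Check that x_i <-> y_i preserves both equality and adjacency."""
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if (xs[i] == xs[j]) != (ys[i] == ys[j]):
                return False
            if (xs[j] in G[xs[i]]) != (ys[j] in H[ys[i]]):
                return False
    return True


def duplicator_wins(G, H, k, xs=(), ys=()):
    """True iff Duplicator can survive k more rounds from the position (xs, ys)."""
    if not is_partial_isomorphism(G, H, xs, ys):
        return False
    if k == 0:
        return True
    # Spoiler moves in G: Duplicator needs a good reply in H.
    for x in G:
        if not any(duplicator_wins(G, H, k - 1, xs + (x,), ys + (y,)) for y in H):
            return False
    # Spoiler moves in H: Duplicator needs a good reply in G.
    for y in H:
        if not any(duplicator_wins(G, H, k - 1, xs + (x,), ys + (y,)) for x in G):
            return False
    return True


def distinguishing_depth(G, H, max_rounds):
    """Smallest r <= max_rounds for which Spoiler wins the r-round game, else None."""
    for r in range(1, max_rounds + 1):
        if not duplicator_wins(G, H, r):
            return r
    return None


def complete_graph(n):
    return {v: set(range(n)) - {v} for v in range(n)}


print(distinguishing_depth(complete_graph(3), complete_graph(4), 5))
```

On $K_3$ versus $K_4$ Spoiler needs four rounds (he marks four distinct vertices of $K_4$), which agrees with $D(K_n) = n + 1$ for $n = 3$.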

If Spoiler can win the game, alternating between the graphs $G$ and $G'$ at most $r$ times, then
the corresponding sentence has the alternation number at most $r$, that is, any chain of nested
quantifiers has at most $r$ changes between $\exists$ and $\forall$. (To make this well-defined, we assume
that no quantifier is within the range of a negation sign.) Let $D_r(G)$ be the smallest depth
of a sentence which defines $G$ and has the alternation number at most $r$. Clearly,
$D_r(G) = \max\{D_r(G, G') : G' \not\cong G\}$, where $D_r(G, G')$ may be defined as the smallest $k$ such that Spoiler
can win $\mathrm{Ehr}_k(G, G')$ with at most $r$ alternations.

For small r, this is a considerable restriction, giving a qualitative strengthening of the obtained 
results. Therefore, we make the extra effort of computing the alternation number given by our 
strategies if the obtained r is really small. 

Finally, let us make a few remarks on our terminology. When a player marks a vertex, we may
also say that the player selects (or claims) the vertex. Duplicator loses after $i$ rounds if the
correspondence between $(x_1, \dots, x_i)$ and $(x'_1, \dots, x'_i)$ is not a partial isomorphism. (Of course,
there is no point in continuing the game in this situation.)



3 Constant Edge Probability

As $D(\overline{G}) = D(G)$, we can assume without loss of generality that $p \le \frac12$. For brevity let us
denote $q = 1 - p$. In this section we prove the following result.



Theorem 2 Let $p$ be a constant, $0 < p \le \frac12$. Let $G \in \mathcal{G}(n,p)$. Then whp

$$O(1) \le D(G) - \log_{1/p} n + 2\log_{1/p}\ln n \le (2 + o(1))\,\frac{\ln\ln n}{-p\ln p - q\ln q}. \tag{3}$$



The lower bound follows by observing that if for any disjoint $A, B \subset V(G)$ with $|A| + |B| \le k$,
there is a vertex $y$ connected to everything in $A$ but to nothing in $B$ (this is called the
$k$-extension property or the $k$-Alice Restaurant property), then $D(G) \ge k + 2$. The upper bound
is obtained by some kind of recursion, where for every $x \in V(G)$ we write a sentence $A_x$ describing
its neighborhood $\Gamma(x)$. Whp no two neighborhoods are isomorphic so $A_x$ "defines" $x$ and the
final sentence $A$ stipulates that the (unique) vertices satisfying $A_x$ and $A_y$ are connected if
and only if $x, y \in V(G)$ are. As each recursive step reduces the order by a factor of about $1/p$,
the obtained sentence has depth around $\log_{1/p} n$. There are some technicalities to overcome.
However, the combinatorial setting of the Ehrenfeucht games makes the proof more transparent
and accessible.
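
The $k$-extension property can be tested by brute force on small graphs. The sketch below is an illustration only: it assumes graphs are dictionaries from vertices to neighbour sets, and it additionally requires the witness $y$ to lie outside $A \cup B$ (an assumption of this example, which is what Duplicator needs in order to answer with a fresh vertex).

```python
from itertools import combinations


def has_extension_property(adj, k):
    """Brute-force test of the k-extension property.

    For all disjoint A, B with |A| + |B| <= k there must be a vertex y
    outside A and B adjacent to every vertex of A and to no vertex of B.
    """
    vertices = list(adj)
    for size in range(k + 1):
        for S in combinations(vertices, size):
            for a_size in range(size + 1):
                for A in combinations(S, a_size):
                    A_set = set(A)
                    B_set = set(S) - A_set
                    if not any(
                        A_set <= adj[y] and not (B_set & adj[y])
                        for y in vertices
                        if y not in S
                    ):
                        return False
    return True


# The 5-cycle: every vertex has a neighbour and a non-neighbour,
# but two adjacent vertices have no common neighbour.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(has_extension_property(c5, 1), has_extension_property(c5, 2))
```

The running time is exponential in $k$, so this is meant only for experimenting with small examples.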

Unfortunately, we have hardly any control on the alternation number in Theorem 2. The 
following result fills this gap by providing the defining sentences of a very restrictive form: no 
alternation at all. This, however, comes at the expense of increasing the depth by a constant 
factor. 

Theorem 3 Let $p$ be a constant, $0 < p < 1$. Let $G \in \mathcal{G}(n,p)$. Then whp

$$D_0(G) \le (2 + o(1))\,\frac{\ln n}{-\ln(p^2 + q^2)}. \tag{4}$$

Remark. If we are happy to bound $D_1$ only, then the constant in (4) can be improved: in the
proof (Section 3.3) we have to use Lemma 8 instead of Lemma 10.

3.1 The Lower Bound

To prove the lower bound in (3) we use the following lemma.

Lemma 4 If $G$ has the $k$-extension property, then $D(G) \ge k + 2$.

Proof. Let $G' \not\cong G$ be another graph which has the $k$-extension property. (For example, we
can take a random graph of large order.) Consider $\mathrm{Ehr}_{k+1}(G, G')$. Duplicator's strategy is
straightforward. If in the $i$-th round Spoiler selects a previously marked vertex, Duplicator
does the same in the other graph. Otherwise, she matches the adjacencies between $x_i$ and
$\{x_1, \dots, x_{i-1}\}$ to those between $x'_i$ and $\{x'_1, \dots, x'_{i-1}\}$ by the $k$-extension property. ∎

It is easy to show that whp $G \in \mathcal{G}(n,p)$, for constant $p \in (0, \frac12]$, has the $\lfloor r \rfloor$-extension property
with

$$r = \log_{1/p} n - 2\log_{1/p}\ln n + \log_{1/p}\ln(1/p) - o(1), \tag{5}$$

which gives us the required lower bound by Lemma 4. Indeed, for $k \le r - O(1)$, the expected
number of 'bad' $A, B \subset V(G)$ with $|A| + |B| = k$ can be bounded by $o(1)$.

3.2 The Upper Bound 

Let $G = (V, E)$ be a graph. Let $\mathcal{V}_i$ consist of all ordered sequences of $i$ pairwise distinct vertices
of $G$. For $\mathbf{x} = (x_1, \dots, x_i) \in \mathcal{V}_i$ define

$$V_{\mathbf{x}} = \{y \in V \setminus \{x_1, \dots, x_i\} : \forall j \in [i]\ \ \{y, x_j\} \in E\},$$

and $G_{\mathbf{x}} = G[V_{\mathbf{x}}]$. We abbreviate $G_{x_1,\dots,x_i} = G_{(x_1,\dots,x_i)}$, etc. Let us agree that $\mathcal{V}_{-1} = \emptyset$, $\mathcal{V}_0 = \{()\}$
consists of the empty sequence, and $G_{()} = G$.

The following lemma specifies our global line of attack.

Lemma 5 Suppose that a graph $G$, numbers $l \ge 0$ and $l_0 \ge 3$ satisfy all of the following
conditions.

1. For any $\mathbf{x} \in \mathcal{V}_{l-1} \cup \mathcal{V}_l$ we have $D(G_{\mathbf{x}}) \le l_0$.

2. For any $i < l - 1$, $\mathbf{x} \in \mathcal{V}_i$, and distinct $y, z \in V_{\mathbf{x}}$, the following two conditions hold. Let
$U = V_{\mathbf{x},y,z}$.

a. Any injection $f : U \to V_{\mathbf{x},y}$ which embeds $G_{\mathbf{x},y,z}$ as an induced subgraph into $G_{\mathbf{x},y}$ is
the identity mapping. (In particular, $G_{\mathbf{x},y,z}$ admits no non-trivial automorphism.)

b. There is a vertex $v \in V_{\mathbf{x},y} \setminus U$ such that for any vertex $w \in V_{\mathbf{x},z} \setminus U$ we have
$\Gamma(v) \cap U \ne \Gamma(w) \cap U$, where $\Gamma$ denotes the neighborhood of a vertex.

3. For any $i < l - 1$, $\mathbf{x} \in \mathcal{V}_i$, and distinct $y, z, w \in V_{\mathbf{x}}$, $G_{\mathbf{x},y,z}$ is not isomorphic to an induced
subgraph of $G_{\mathbf{x},w}$.

Then $D(G) \le l + l_0$.

Proof. Let us observe first that Condition (2.) (or Condition (3.)) implies that

$$G_{x_1,\dots,x_i,y} \not\cong G_{x_1,\dots,x_i,z} \quad \text{for any } i < l - 1,\ (x_1, \dots, x_i) \in \mathcal{V}_i, \text{ and distinct } y, z \in V_{x_1,\dots,x_i}. \tag{6}$$

We prove the lemma by induction on $l$. If $l$ is 0 or 1, then Condition (1.) alone implies the
claim. So, let $l \ge 2$. Let $G' = (V', E')$ be any graph which is not isomorphic to $G$.

Case 1 Suppose that there is $x \in V$ such that $G_x \not\cong G'_y$ for any $y \in V'$. Spoiler selects this
$x$. Let $x'$ be the Duplicator's reply. The graph $G_x$ satisfies all the assumptions of Lemma 5
with $l$ decreased by 1. Spoiler will always play inside one of $G_x$ or $G'_{x'}$. We can assume that
Duplicator does the same for otherwise the adjacencies to $x$ and $x'$ do not correspond. As
$G_x \not\cong G'_{x'}$, Spoiler can use the induction to win the $(G_x, G'_{x'})$-game in at most $l + l_0 - 1$ moves,
as required.

The same argument works if there is $x \in V'$ such that $G'_x \not\cong G_y$ for any $y \in V$.

Case 2 Suppose now that there are $x \in V$ and distinct $y', z' \in V'$ such that

$$G_x \cong G'_{y'} \cong G'_{z'}. \tag{7}$$

Spoiler selects $y' \in V'$. Assume that Duplicator replies with $y = x$, for otherwise $G_y \not\cong G'_{y'}$
by (6) and Spoiler proceeds as in Case 1. Now Spoiler selects $z'$; let $z \in V$ be the Duplicator's
reply. We can assume that

$$G_{y,z} \cong G'_{y',z'}, \tag{8}$$

for otherwise Spoiler applies the inductive strategy to the $(G_{y,z}, G'_{y',z'})$-game, where $l$ is reduced
by 2.

We show that Spoiler can win in at most 3 extra moves now. Let $U = V_{y,z}$ and $U' = V'_{y',z'}$.
Spoiler selects the vertex $v \in V_y \setminus U$ given by Condition (2b.). Let $v' \in V'_{y'} \setminus U'$ be the Duplicator's
reply. By Condition (2a.) and (7)-(8) there is a bijection $g : V'_{y'} \to V'_{z'}$ which is the identity on
$U'$ and induces an isomorphism of $G'_{y'}$ onto $G'_{z'}$. Spoiler selects $w' = g(v')$. Whatever the reply
$w \in V_z \setminus U$ of Duplicator is, $\Gamma(v) \cap U \ne \Gamma(w) \cap U$. But in $G'$ we have $\Gamma(v') \cap U' = \Gamma(w') \cap U'$.
Spoiler can point out this difference with one more move into $U$. The total number of moves is
$5 \le l + l_0$, as required.

By (6), the only remaining case is the following.

Case 3 Suppose that there is a bijection $g : V \to V'$ such that for any $x \in V$ we have
$G_x \cong G'_{g(x)}$.

As $G \not\cong G'$, there are $y, z \in V$ such that $g$ does not preserve the adjacency between $y$ and
$z$. Spoiler selects $y$. We can assume that Duplicator replies with $y' = g(y)$ for otherwise
Spoiler proceeds as in Case 1. Now, Spoiler selects $z$, to which Duplicator is forced to reply
with $z' \ne g(z)$. Assume that $G_{y,z} \cong G'_{y',z'}$ for otherwise Spoiler applies the inductive strategy
for $l - 2$ to these graphs. But then $G_{y,z}$ is an induced subgraph of $G_w$, where $w = g^{-1}(z')$,
contradicting Condition (3.). ∎

In order to finish the proof of Theorem 2 we apply Lemma 5 to a random graph $G \in \mathcal{G}(n,p)$.

Let $l = \log_{1/p} n - C\log_{1/p}\ln n - 1$ and $l_0 = C_0 \ln\ln n$ where $C$ and $C_0$ are constants such that
$C > 2$ and $C_0(-p\ln p - q\ln q) > C$. Let $m = np^{l+1} = \ln^C n$. Let $\epsilon > 0$ be a small constant.
Let $n$ be sufficiently large.

Let $\mathcal{V} = \bigcup_{i=0}^{l+1} \mathcal{V}_i$. Observe that $|\mathcal{V}| \le e^{(1+\epsilon)\log_{1/p} n \,\ln n} = e^{O(\ln^2 n)}$.

Lemma 6 Whp for any $i \le l + 1$ and $\mathbf{x} \in \mathcal{V}_i$ we have

$$\big|\, |V_{\mathbf{x}}| - p^i n \,\big| < \epsilon p^i n. \tag{9}$$

Proof. Fix some $\mathbf{x} \in \mathcal{V}_i$. The size of $V_{\mathbf{x}}$ has the binomial distribution with parameters $(n - i, p^i)$.
By Chernoff's bound ([1, Appendix A]), the probability $p'$ that this $\mathbf{x}$ violates (9) is

$$p' \le 2e^{-\epsilon^2 n p^i/3} \le 2e^{-\epsilon^2 m/3} = o(|\mathcal{V}|^{-1}). \tag{10}$$

Thus the expected number of 'bad' $\mathbf{x}$'s is $o(1)$, giving the required. ∎

Lemma 7 Whp Condition (2.) holds.

Proof. Fix $i < l - 1$, $\mathbf{x} \in \mathcal{V}_i$, and $y, z \in V_{\mathbf{x}}$. Let $U = V_{\mathbf{x},y,z}$, $W = V_{\mathbf{x},y}$, $u = |U|$, and $w = |W|$.

First we deal with Condition (2a.). Take $j \in [1, u]$. Let $g$ be any injection from $U$ into $W$ such
that $|U_g| = j$, where $U_g = \{v \in U : g(v) \ne v\}$ consists of the elements moved by $g$. Let the
same symbol $g$ denote also the induced action on edges. Let $E_g$ consist of those $e \in \binom{U}{2}$ such
that $g(e) \ne e$. It is not hard to see that

$$|E_g| \ge \binom{j}{2} + j(u - j) - \frac{j}{2} = j(u - j/2 - 1).$$

We can find a set $D \subset E_g$ of size at least $|E_g|/3$ such that $D \cap g(D) = \emptyset$. We do so greedily:
choose any $e \in E_g$, move $e$ to $D$, and remove $g(e)$, $g^{-1}(e)$ from $E_g$ (if they belong there). The
probability that, for all $e \in D$, the 2-sets $e$ and $g(e)$ are simultaneously edges or non-edges is
$(p^2 + q^2)^{|D|}$ because these events are independent. This gives an upper bound on the probability
that $g$ induces an isomorphism.
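
The greedy choice of $D$ can be written out as follows. This is only an illustrative Python sketch: the injection $g$ is assumed to be given as a dictionary on $U$, pairs are represented as frozensets, and the function name `greedy_disjoint_pairs` is hypothetical.

```python
from itertools import combinations


def greedy_disjoint_pairs(U, g):
    """Greedily pick D inside E_g with D and g(D) disjoint.

    U is an iterable of vertices and g a dict describing an injection on U.
    Returns (D, |E_g|), where E_g is the set of pairs e of U with g(e) != e.
    """
    def image(e):
        return frozenset(g[v] for v in e)

    pairs = [frozenset(e) for e in combinations(list(U), 2)]
    preimage = {image(e): e for e in pairs}   # g is injective, so this is well defined
    remaining = {e for e in pairs if image(e) != e}
    total = len(remaining)

    D = set()
    while remaining:
        e = remaining.pop()            # choose any e in E_g and move it to D
        D.add(e)
        remaining.discard(image(e))    # remove g(e), if it belongs there
        pre = preimage.get(e)
        if pre is not None:
            remaining.discard(pre)     # remove g^{-1}(e), if it belongs there
    return D, total


U = range(6)
g = {v: (v + 1) % 6 for v in U}        # a cyclic shift, moving every vertex
D, total = greedy_disjoint_pairs(U, g)
print(len(D), total)
```

Each step deletes at most three pairs from the pool, which is where the bound $|D| \ge |E_g|/3$ comes from.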

Given $j$, there are at most $\binom{u}{j} w^j$ choices of $g$. The sequence $(\mathbf{x}, y)$ (or $(\mathbf{x}, y, z)$) violates (9) with
probability at most $p'$, where $p'$ is as in (10). Thus we can bound the probability that $\mathbf{x}, y, z$
violate Condition (2a.) by

$$2p' + \sum_{j=1}^{u} \binom{u}{j} w^j (p^2 + q^2)^{j(u - j/2 - 1)/3} \le 2p' + (p^2 + q^2)^{(\frac13 - \epsilon)m}.$$

Hence, the expected number of bad witnesses $\mathbf{x}, y, z$ is at most

$$|\mathcal{V}| \left(2p' + (p^2 + q^2)^{(\frac13 - \epsilon)m}\right) = o(1),$$

giving the required by Markov's inequality.

To estimate the probability that (2b.) fails, fix some $v \in W \setminus U$. The probability that a given
vertex of $V_{\mathbf{x},z} \setminus U$ has the same neighborhood in $U$ as $v$ is at most $(p^2 + q^2)^u$. We have $u \ge (1 - \epsilon)m$
with probability at least $1 - p'$. Hence, $v$ does not satisfy Condition (2b.) with probability at
most

$$|V_{\mathbf{x},z}| \left(p' + (p^2 + q^2)^{(1 - \epsilon)m}\right) = o(|\mathcal{V}|^{-1}),$$

finishing the proof. ∎

Condition (3.) is verified similarly to the argument of Lemma 7. (The proof is, in a way, even
easier because $|V_{\mathbf{x},y,z} \setminus V_{\mathbf{x},w}| = \Omega(m)$ whp.) All that remains to check is Condition (1.). To deal
with it, we need another strategic lemma. For a subset $X$ of vertices of $G = (V, E)$, define the
relation $\equiv_X$ on $V$, called the $X$-similarity, by $x \equiv_X y$ iff $x = y$ or $x, y \in V \setminus X$
satisfy $\Gamma(x) \cap X = \Gamma(y) \cap X$. This is an equivalence relation. Let

$$S(X) = \{x \in V : \forall y \in V\ (y \ne x \Rightarrow y \not\equiv_X x)\} \supseteq X.$$

The vertices in $S(X)$ are sifted out by $X$ (that is, are uniquely determined by their adjacencies
to $X$). We call $X$ a sieve if $S(X) = V$.
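
The set $S(X)$ is easy to compute: a vertex outside $X$ is sifted out exactly when no other vertex outside $X$ has the same trace on $X$. A minimal Python sketch, assuming graphs are dictionaries from vertices to neighbour sets (the names `sifted` and `is_sieve` are chosen for this example):

```python
from collections import Counter


def sifted(adj, X):
    """Return S(X): X together with the vertices outside X whose
    neighbourhood inside X is shared with no other vertex outside X."""
    X = frozenset(X)
    trace = {v: frozenset(adj[v] & X) for v in adj if v not in X}
    counts = Counter(trace.values())
    return set(X) | {v for v, t in trace.items() if counts[t] == 1}


def is_sieve(adj, X):
    """X is a sieve if it sifts out every vertex of the graph."""
    return sifted(adj, X) == set(adj)


# The path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
Y = sifted(path, {1})
print(Y, is_sieve(path, Y))
```

In this example $X = \{1\}$ is not a sieve, since vertices 0 and 2 both see exactly vertex 1, but $Y = S(X) = \{1, 3\}$ is one.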

Lemma 8 Let $X \subset V$. Define $Y = S(X)$. If $S(Y) = V$, then $D_1(G) \le |X| + 3$.

Proof. Let $G' \not\cong G$. First, Spoiler selects all of $X$. Let $X' \subset V'$ be the Duplicator's reply.
Assume that Duplicator has not lost yet. For notational simplicity let us identify $X$ and
$X'$ so that $V \cap V' = X = X'$ and both our graphs coincide on $X$. Let $Y' = S_{G'}(X)$.

It is not hard to see that Spoiler wins in at most two extra moves unless the following holds.
For any $y \in Y \setminus X$ there is a $y' \in Y' \setminus X$ (and vice versa) such that $\Gamma(y) \cap X = \Gamma(y') \cap X$.
Moreover, this bijective correspondence between $Y$ and $Y'$ induces an isomorphism between
$G[Y]$ and $G'[Y']$.

Clearly, if Duplicator does not respect this correspondence, she loses immediately. Therefore,
we may identify $Y$ with $Y'$. Let $Z = V \setminus Y$ and $Z' = V' \setminus Y$. Let $z \in Z$ and define

$$W'_z = \{z' \in Z' : \Gamma(z') \cap Y = \Gamma(z) \cap Y\}.$$

If $W'_z = \emptyset$, Spoiler wins in at most two moves. First, he selects $z$. Let Duplicator reply with
$z' \in Z'$. As the neighborhoods of $z$, $z'$ in $Y$ differ, Spoiler can highlight this by picking a vertex
of $Y$. If $|W'_z| \ge 2$, then Spoiler selects some two vertices of $W'_z$ and wins with at most one more
move, as required.

Hence, we can assume that for any $z$ we have $W'_z = \{f(z)\}$ for some $f(z) \in Z'$. It is easy to
see that $f : Z \to Z'$ is in fact a bijection (otherwise Spoiler wins in two moves). As $G \not\cong G'$,
the mapping $f$ does not preserve the adjacency relation between some $y, z \in Z$. Now, Spoiler
selects both $y$ and $z$. Duplicator cannot respond with $f(y)$ and $f(z)$; by the definition of $f$
Spoiler can win in one extra move. ∎

By Lemma 8, to complete the proof of Theorem 2 it suffices to verify that whp for any
$\mathbf{x} \in \mathcal{V}_{l-1} \cup \mathcal{V}_l$ there is an $(l_0 - 3)$-set $X \subset V_{\mathbf{x}}$ such that, with respect to $H = G_{\mathbf{x}}$, $S(Y) = U$, where
$Y = S(X)$ and $U = V_{\mathbf{x}}$. Let $k = l_0 - 3$ and $u = |U|$. Fix any $X \in \binom{U}{k}$. With probability at
least $1 - p'$ we have $up^2 \le (1 + \epsilon)m$. Conditioned on this, $G_{\mathbf{x}}$ is still constructed by choosing
its edges independently. The probability that a vertex $y \in U \setminus X$ belongs to $Y$ is

$$\sum_{i=0}^{k} \binom{k}{i} p^i q^{k-i} \left(1 - p^i q^{k-i}\right)^{u-k-1}. \tag{11}$$

We want to bound this probability from below. Let, for example, $i_0 = pk - k^{1/2}\ln k$. For
$i_0 \le i \le k$ we have $(1 - p^i q^{k-i})^u \ge 1 - \epsilon$ by the definition of $C_0$. Chernoff's bound implies that
$\sum_{i=i_0}^{k} \binom{k}{i} p^i q^{k-i} \ge 1 - \epsilon$ as this sum corresponds to the Binomial distribution with parameters
$(k, p)$. Hence, the expression (11) is at least $(1 - \epsilon)^2 \ge 1 - 2\epsilon$ and the expectation

$$E[|Y|] \ge (1 - 2\epsilon)u.$$

We construct the martingale $Y_0, \dots, Y_{u-k}$, where we expose the vertices of $U \setminus X$ one by one
and $Y_i$ is the expectation of $|Y|$ after $i$ vertices have been exposed. Changing edges incident
to a vertex, we cannot decrease or increase $|Y|$ by more than two. By Azuma's inequality ([1,
Theorem 7.2.1]), the probability that $|Y|$ drops, say, below $(1 - 3\epsilon)u$ is at most $e^{-\Omega(m)} = o(|\mathcal{V}|^{-1})$.
Whp each $Y$ has at least $(1 - 3\epsilon)u$ elements. The following simple lemma completes our quest.

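Lemma 9 Let $\epsilon = \epsilon(p) > 0$ be sufficiently small. Whp for any $\mathbf{x} \in \mathcal{V}_{l-1} \cup \mathcal{V}_l$, every set $Y \subset V_{\mathbf{x}}$
of size at least $(1 - 3\epsilon)u$, $u = |V_{\mathbf{x}}|$, is a sieve in $G_{\mathbf{x}}$.

Proof. Let $\mathbf{x}$ satisfy (9). The expected number of bad triples $(Y, y, z)$ (that is, the distinct
vertices $y, z \in V_{\mathbf{x}} \setminus Y$ have the same neighborhood in a set $Y$ of size at least $(1 - 3\epsilon)u$) is at most

$$\sum_{i \le 3\epsilon u} \binom{u}{i}\, u^2\, (p^2 + q^2)^{u-i}.$$

The claim follows from (10). ∎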

3.3 Games with no Alternations 

Following our standard scheme, we first specify a graph property which ensures the desired 
bound on D(G) and then show that a random graph satisfies this property whp. 

Lemma 10 Assume that in a graph $G = (V, E)$ we can find $X \subset V$ such that

1. $X$ is a sieve;

2. $G[X]$ has no nontrivial automorphism;

3. $G$ has no other induced subgraph isomorphic to $G[X]$.

Then $D_0(G) \le |X| + 2$.

Proof. Let $G'$ be an arbitrary graph non-isomorphic to $G$. For some $G'$ Spoiler plays all the
time in $G$, for others he plays all the time in $G'$.

We first describe the strategy when Spoiler plays in $G$. Spoiler selects all vertices in $X$. Suppose
that Duplicator managed to establish $\phi : X \to X'$, a partial isomorphism from $G$ to $G'$, where
$X' \subset V'$ is the set of Duplicator's responses. Denote $Z = V(G) \setminus X$ and $Z' = V(G') \setminus X'$. We
call two vertices, $v \in Z$ and $v' \in Z'$, $\phi$-similar if the extension of $\phi$ which takes $v$ to $v'$ is a
partial isomorphism from $G$ to $G'$. Four cases are possible:

Case 1 The $\phi$-similarity is a one-to-one correspondence between $Z$ and $Z'$.

Case 2 There is $v \in Z$ without a $\phi$-similar counterpart in $Z'$.

Case 3 There is $v' \in Z'$ without a $\phi$-similar counterpart in $Z$.

Case 4 There are $v'_1, v'_2 \in Z'$ with the same $\phi$-similar counterpart in $Z$.

In Case 1 there are $v_1, v_2 \in Z$ with adjacency different from the adjacency between their
$\phi$-similar counterparts in $Z'$. Spoiler selects $v_1$ and $v_2$ and wins. In Case 2 Spoiler wins by
selecting the vertex $v$. In Cases 3 and 4 Spoiler fails in this way but plays differently from the
very beginning.

Namely, if there exist $X'$ and a partial isomorphism $\phi : X \to X'$ such that Cases 3 or 4 occur,
Spoiler begins with selecting all vertices in $X'$. Duplicator is forced to reply in accordance with
$\phi$ due to the conditions assumed for $X$. Then Spoiler selects the vertex $v'$ in Case 3 or $v'_1$ and
$v'_2$ in Case 4 and wins. ∎



Lemma 11 Let $\epsilon > 0$ and $0 < p < 1$ be fixed. Let $G \in \mathcal{G}(n,p)$ and let $X \subset V$ be any set of
size $t \ge (2 + \epsilon)\log_{1/r} n$, where $r = p^2 + q^2$. Then whp Conditions (1.)-(3.) of Lemma 10 hold.

Proof. The expected number of pairs of vertices with the same neighborhood in $X$ is at most $n^2 r^t = o(1)$,
implying (1.). Conditions (2.) and (3.) follow from the following claim.

Claim 1 Whp no injective $g : X \to V$, with the exception of the identity mapping, preserves
the adjacency relation.

Proof of Claim. Fix $g$. Let $k = |K|$ and $l = |L|$, where

$$K = \{x \in X : g(x) \notin X\}, \qquad L = \{x \in X : g(x) \in X \setminus \{x\}\}.$$

As in the proof of Lemma 7 we can find a set $D \subset \binom{X \setminus K}{2}$ of size at least $l(t - k - l/2 - 1)/3$ such that
$D \cap g(D) = \emptyset$. The latter property still holds if we enlarge $D$ by the set of all elements of $\binom{X}{2}$
incident to $K$. Hence, the total probability of failure is at most

$$\sum_{0 < k + l \le t} \binom{t}{k} \binom{t - k}{l}\, n^k t^l\, r^{|D|} = o(1),$$

completing the proof of the claim and the lemma. ∎



4 Edge Probability 1/2 

Here is the main result of this section. 

Theorem 12 Let $G \in \mathcal{G}(n, \frac12)$. For infinitely many values of $n$ we have whp

$$D_2(G) \le \log_2 n - 2\log_2\ln n + \log_2\ln 2 + 6 + o(1). \tag{12}$$

Remark. The lower bound given by Lemma 4 and the case $p = \frac12$ of (5) is by at most $5 + o(1)$
smaller than the upper bound in (12). This implies that $D(G)$ and $D_2(G)$ are concentrated on
at most 6 different values for such $n$. In Section 4.3 we will show that whp we have only 5
possible values.

Before we start proving Theorem 12, let us observe that for $p = \frac12$ and an arbitrary $n$ the upper
bound (3) can be improved by using Lemma 8 to

$$D_1(G) \le \log_2 n - \log_2\ln n + O(\ln\ln\ln n).$$

(Details are left to the interested Reader.)

4.1 Spoiler's Strategy

Before we can specify the plan of our attack on Theorem 12, we have to give a few definitions.
Let $G = (V, E)$, $W \subset V$, and $u \in \mathbb{N}$. Building upon the notions defined before Lemma 8, let

$$S_u(W) = \bigcup_{\substack{U \subset V \setminus W \\ |U| = u}} \big(S(U \cup W) \setminus (U \cup W)\big).$$

In other words, a vertex $y \notin W$ belongs to $S_u(W)$ iff there is a $u$-set $U \subset V \setminus (W \cup \{y\})$ such
that $U \cup W$ sifts out $y$. Note that $S(W) = S_0(W) \cup W$.

Lemma 13 Let $Y = S_u(W)$. Suppose that $Y \cup W$ is a sieve in $G$ and that no two vertices of
$Y$ have the same neighborhood in $W$. Then $D_2(G) \le u + w + 4$, where $w = |W|$.

Proof. Let $G' = (V', E')$ be a graph non-isomorphic to $G$. We describe a strategy allowing
Spoiler to win the game $\mathrm{Ehr}_{u+w+4}(G, G')$.

Spoiler first claims $W$. Let Duplicator reply with $W' \subset V'$. Assume that she does not lose in
this phase, establishing a partial isomorphism $f : W \to W'$. Recall that we call two vertices
$v \in V$ and $v' \in V'$ $f$-similar if the extension of $f$ taking $v$ to $v'$ is still a partial isomorphism
from $G$ to $G'$. Let $Y' = S_u(W')$.

Claim 1 As soon as Spoiler moves inside $Y \cup Y'$ but Duplicator replies outside $Y \cup Y'$, Spoiler
is able to win in the next $u + 1$ moves with 1 alternation between the graphs.

Proof of Claim. Assume for example that, while Spoiler selects $y \in Y$, Duplicator replies with
$y' \notin Y'$. Spoiler selects some $u$-set $U$ with $y$ sifted out by $U \cup W$. Let the reply to it be $U'$. By
the assumption on $y'$, there is another vertex $z'$ with the same adjacencies to $U' \cup W'$. Spoiler
selects $z'$ and wins. ∎

Claim 2 If $Y'$ contains two vertices with the same adjacencies to $W'$, then Spoiler is able to
win the game in $w + u + 3$ moves with at most 2 alternations.

Proof of Claim. Assume that $y'$ and $z'$ are both in $Y'$ and have the same adjacencies to $W'$.
Spoiler selects these two vertices. In order not to lose immediately, Duplicator is forced to reply
at least once outside $Y$. Spoiler wins in the next $u + 1$ moves according to Claim 1. ∎

Assume therefore that all vertices in $Y'$ have pairwise distinct neighborhoods in $W'$. This
assumption and Claim 1 imply that either the $f$-similarity determines a one-to-one
correspondence between $Y$ and $Y'$ or Spoiler is able to win the game in $w + u + 3$ moves with at most 2
alternations. We will assume the first alternative. Extend $f$ to a map from $W \cup Y$ onto $W' \cup Y'$
according to the $f$-similarity correspondence between $Y$ and $Y'$.

Claim 3 Suppose that Duplicator failed to respect the bijection / after a Spoiler's move into 
Y U Y'. Then Spoiler can win in at most u + 1 extra moves, during which he alternates at most 
once. 

Proof of Claim. Suppose that the previous move x of Spoiler was in G, for example. Clearly, 
the Duplicator's response x' cannot belong to Y' because f(x) is the only vertex in Y' with the 
required VF'-adjacencies. Spoiler applies the strategy of Claim 1. | 

Claim 4 If / : W U Y — > W U Y' is not a partial isomorphism from G to G' , then Spoiler is 
able to win the game in w + u + 3 moves with 1 alternation. 

Proof of Claim. Assume, for example, that {2/1,1/2} G (^) is an edge while {/(yi), /(z/2)} is 
not. Spoiler picks y\ and yi- Duplicator cannot reply with /(yi) and /(j/2) so Spoiler wins in 
at most w + 2 + (u + l)=w + u + 3 moves by Claim 3. I 

Assume therefore that / : W U Y — > W' U Y' is a partial isomorphism. Denote R = V\ (W U 1") 
and R! = V'\ (W'UY'). 

Claim 5 As soon as Spoiler moves inside RUR' but Duplicator fails to reply with an /-similar 
vertex in R U R' (in the other graph), Spoiler can win in at most u + 2 extra moves, during 
which he alternates at most once. 

Proof of Claim. If Duplicator replies with a vertex i£7U Y', in the next move Spoiler marks 
the /-mate of x and then applies the strategy of Claim 3. If she replies in R U R' but not with 
an /-similar vertex, Spoiler highlights this in one more move and again uses Claim 3. I 

Claim 6 If W' U Y' is not a sieve in G', then Spoiler is able to win the game in w + u + 4 
moves with 2 alternations. 

Proof of Claim. Spoiler picks two witnesses $z_1', z_2' \in R'$ with the same adjacencies to $W' \cup Y'$.
If at least one of the corresponding replies $z_1, z_2$ is not in $R$, Spoiler applies the strategy of
Claim 5. Otherwise $z_1$ and $z_2$ belong to different $W \cup Y$-similarity classes and there is a vertex
$x \in W \cup Y$ adjacent to exactly one of $z_1$ and $z_2$. If $x \in W$, Spoiler wins immediately. If $x \in Y$,
Spoiler picks $f(x) \in Y'$. Duplicator cannot respond with $x$. Now Spoiler wins the game in at
most $u + 1$ extra moves by Claim 3. □

In the rest of the proof we suppose that $W' \cup Y'$ is a sieve. This assumption and Claim 5
imply that either the $f$-similarity determines a one-to-one correspondence between $R$ and $R'$ or
Spoiler is able to win the game in $w + u + 3$ moves with 2 alternations. Let us assume the first
alternative. Extend $f$ to the whole of $V$ according to the $f$-similarity correspondence between
$R$ and $R'$. Thus $f$ is now a bijection between $V$ and $V'$. As $G$ and $G'$ are not isomorphic, $f$
does not preserve the adjacency for some $\{y_1, y_2\} \in \binom{V}{2}$. Spoiler selects $y_1$ and $y_2$. If Duplicator
replies with $f(y_1)$ and $f(y_2)$, she loses immediately. Otherwise, Spoiler applies the strategy of
Claim 5 and wins, having made at most $u + w + 4$ moves and 1 alternation in total. □

4.2 The Probabilistic Part



Let $k$ be given. For simplicity let us assume that $k$ is even. Define

$$f(n,k) = \binom{n - k/2}{k/2}\,(n-k)\,(1-2^{-k})^{n-k-1}.$$

Basic asymptotics show that for $n = \Theta(k^2 2^k)$ we have $f(n+1,k)/f(n,k) \sim 1$ and thus we can find
$n = (\frac{\ln 2}{2} + o(1))\, k^2 2^k$ such that

$$f(n,k) = (10 + o(1)) \log_2 n.$$

We fix this $n$. Routine calculations show that

$$k \le \log_2 n - 2 \log_2 \ln n + \log_2 \ln 2 + 1 + o(1). \qquad (13)$$
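The choice of $n$ can be explored numerically. The following is a minimal sketch, not part of the original argument: it treats $n$ as a real variable, works with $\ln f(n,k)$ for $f(n,k) = \binom{n-k/2}{k/2}(n-k)(1-2^{-k})^{n-k-1}$, and locates by bisection the point on the decreasing branch of $f(\cdot,k)$ where $f(n,k) = 10\log_2 n$. The function names `log_f` and `solve_n` and the bracketing interval are illustrative assumptions; the bracket was only checked for moderate even $k$ such as 10 and 20.

```python
import math

def log_f(n: float, k: int) -> float:
    """Natural log of f(n,k) = C(n-k/2, k/2) * (n-k) * (1-2^-k)^(n-k-1)."""
    h = k // 2
    # log C(n - k/2, k/2), written as a product of k/2 ratios
    log_binom = sum(math.log((n - h - i) / (i + 1)) for i in range(h))
    return log_binom + math.log(n - k) + (n - k - 1) * math.log1p(-2.0 ** (-k))

def solve_n(k: int) -> float:
    """Bisection for the real n beyond the peak of f(., k) with f(n,k) = 10*log2(n)."""
    def g(n: float) -> float:
        return log_f(n, k) - math.log(10 * math.log2(n))
    lo = (k / 2 + 1) * 2.0 ** k   # near the maximum of f(., k); g(lo) > 0 for the k tried
    hi = k * k * 2.0 ** k         # well past the root; g(hi) < 0 for the k tried
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    k = 20
    n = solve_n(k)
    print("n / (k^2 2^k) =", n / (k * k * 2 ** k), " limit ln2/2 =", math.log(2) / 2)
    print("right-hand side of (13) without o(1):",
          math.log2(n) - 2 * math.log2(math.log(n)) + math.log2(math.log(2)) + 1)
```

For moderate $k$ the ratio $n/(k^2 2^k)$ is still noticeably above $\frac{\ln 2}{2}$, which illustrates how slowly the $o(1)$ terms in the text decay.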

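Let $A$ be a fixed $\frac{k}{2}$-subset of the vertices of $G \in \mathcal{G}(n, \frac12)$. Let $\mathcal{U}$ consist of pairs $(U,y)$, where $U$ is a $k$-set
containing $A$ and $y \in V \setminus U$. For $(U,y) \in \mathcal{U}$ let $I(U,y)$ denote the indicator random variable
for the event $y \in S(U)$. We define

$$X = \sum_{(U,y) \in \mathcal{U}} I(U,y),$$

$$M = |\mathcal{U}| = \binom{n - k/2}{k/2}\,(n-k),$$

$$p = E[I(U,y)] = (1 - 2^{-k})^{n-k-1}.$$

We further set

$$\mu = E[X] = Mp = f(n,k) = (10 + o(1)) \log_2 n. \qquad (14)$$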

The idea behind these definitions is that we try to apply Lemma 13 for $W = A$ and $u = \frac{k}{2}$.
Then $\mu$ is the expected number of ways to construct a vertex of $S_u(W)$. Our proof works only
if $\mu$ is neither too big nor too small, that is, for some special values of $n$ only. We do not know
if $D(G)$ can be pinned down to $O(1)$ distinct values for an arbitrary $n$.



As $k \sim \log_2 n$ we have

$$M \sim \frac{n^{k/2+1}}{(k/2)!} = n^{\frac{k}{2}(1+o(1))}.$$

As $\mu = Mp = (10 + o(1)) \log_2 n$ we further have

$$p \sim e^{-n 2^{-k}} = n^{-\frac{k}{2}(1+o(1))}. \qquad (15)$$



Lemma 14 For distinct $(U_1,y_1), (U_2,y_2) \in \mathcal{U}$,

$$E[I(U_1,y_1)\, I(U_2,y_2)] \le n^{-\frac{3k}{4}(1+o(1))}. \qquad (16)$$

When $|U_1 \cap U_2| \le \frac{9k}{10}$,

$$E[I(U_1,y_1)\, I(U_2,y_2)] \le E[I(U_1,y_1)]\, E[I(U_2,y_2)]\, \big(1 + O(n^{-\frac{1}{10}(1+o(1))})\big). \qquad (17)$$

Proof. Condition on the adjacency patterns of $y_1, y_2$ to $U_1, U_2$ respectively. Let $z$ be any vertex
not in $U_1 \cup U_2 \cup \{y_1, y_2\}$. Suppose (the main case) $U_1 \ne U_2$. Then

$$\Pr[(z \equiv_{U_1} y_1) \wedge (z \equiv_{U_2} y_2)] \le 2^{-k-1} \qquad (18)$$

as the adjacency pattern of $z$ to $U_1 \cup U_2$ is then determined. When $U_1 = U_2$ the adjacency
patterns of $y_1, y_2$ to $U_1$ must be different as otherwise $I(U_1,y_1) = I(U_2,y_2) = 0$. Then it would
be impossible to have $z \equiv_{U_1} y_1$ and $z \equiv_{U_2} y_2$ so (18) still holds. By inclusion-exclusion

$$\Pr[(z \equiv_{U_1} y_1) \vee (z \equiv_{U_2} y_2)] \ge 2 \cdot 2^{-k} - 2^{-k-1} = 3 \cdot 2^{-k-1}.$$

If $I(U_1,y_1)\, I(U_2,y_2) = 1$ then this fails for all such $z$. But these events are mutually independent.
Thus, by (15)

$$E[I(U_1,y_1)\, I(U_2,y_2)] \le (1 - 3 \cdot 2^{-k-1})^{n-2k-2} = n^{-\frac{3k}{4}(1+o(1))}, \qquad (19)$$

giving the required.

Suppose further that $|U_1 \cap U_2| \le \frac{9k}{10}$. Again let $z$ be any vertex not in $U_1 \cup U_2 \cup \{y_1, y_2\}$ and
condition on the adjacency patterns of $y_1, y_2$ to $U_1, U_2$ respectively. Now

$$\Pr[(z \equiv_{U_1} y_1) \wedge (z \equiv_{U_2} y_2)] \le 2^{-\frac{11k}{10}}$$

as this event requires $z$ to have a given adjacency pattern to $U_1 \cup U_2$. Again by inclusion-exclusion

$$\Pr[(z \equiv_{U_1} y_1) \vee (z \equiv_{U_2} y_2)] \ge 2 \cdot 2^{-k} - 2^{-\frac{11k}{10}}.$$

As with (19) we deduce

$$E[I(U_1,y_1)\, I(U_2,y_2)] \le \big(1 - 2 \cdot 2^{-k} + 2^{-\frac{11k}{10}}\big)^{n-2k-2}.$$

Now we want to compare this to $E[I(U_1,y_1)]\, E[I(U_2,y_2)]$. We have

$$\big(1 - 2 \cdot 2^{-k} + 2^{-\frac{11k}{10}}\big)^{n-2k-2} = \big((1 - 2^{-k})^{n-k-1}\big)^2 \big(1 + O(n 2^{-\frac{11k}{10}})\big),$$

yielding (17). □



Lemma 15

$$\mathrm{Var}[X] = O(E[X]). \qquad (20)$$

Proof. As $X = \sum I(U,y)$, the sum of indicator random variables, we employ the general bound

$$\mathrm{Var}[X] \le E[X] + \sum \mathrm{Cov}[I(U_1,y_1), I(U_2,y_2)], \qquad (21)$$

the sum over distinct $(U_1,y_1), (U_2,y_2) \in \mathcal{U}$. The first term is $\mu$. Consider the sum of the
covariances satisfying $|U_1 \cap U_2| > \frac{9k}{10}$. There are $M$ choices for $(U_1,y_1)$. For a given $U_1$ there
are $n^{\frac{k}{10}(1+o(1))}$ choices for $(U_2,y_2)$ and

$$\sum E[I(U_1,y_1)\, I(U_2,y_2)] \le M\, n^{\frac{k}{10}(1+o(1))}\, n^{-\frac{3k}{4}(1+o(1))} = o(1). \qquad (22)$$

As the covariance of indicator random variables is at most the expectation of the product,

$$\sum \mathrm{Cov}[I(U_1,y_1), I(U_2,y_2)] = o(1), \qquad (23)$$

where in (22)-(23) the sum is restricted to $|U_1 \cap U_2| > \frac{9k}{10}$. When $|U_1 \cap U_2| \le \frac{9k}{10}$ we have from
(17) that

$$\mathrm{Cov}[I(U_1,y_1), I(U_2,y_2)] = O\big(E[I(U_1,y_1)]\, E[I(U_2,y_2)]\, n^{-\frac{1}{10}(1+o(1))}\big).$$

Hence (the sum over $|U_1 \cap U_2| \le \frac{9k}{10}$)

$$\sum \mathrm{Cov}[I(U_1,y_1), I(U_2,y_2)] = O\big(n^{-\frac{1}{10}(1+o(1))}\big) \sum E[I(U_1,y_1)]\, E[I(U_2,y_2)].$$

But $\sum E[I(U_1,y_1)]\, E[I(U_2,y_2)]$ over all $(U_1,y_1), (U_2,y_2) \in \mathcal{U}$ is precisely $\mu^2 = O(\ln^2 n)$. This
becomes absorbed in the $n^{-\frac{1}{10}}$ term and (the sum again over $|U_1 \cap U_2| \le \frac{9k}{10}$)

$$\sum \mathrm{Cov}[I(U_1,y_1), I(U_2,y_2)] = O\big(n^{-\frac{1}{10}(1+o(1))}\big). \qquad (24)$$

In particular, all covariances in (21) together add up to $o(1)$ and so we actually have the stronger
result $\mathrm{Var}[X] \le E[X] + o(1)$. □

Lemma 16 Whp every pair $(U_1,y_1) \ne (U_2,y_2)$ from $\mathcal{U}$ with $I(U_1,y_1) = I(U_2,y_2) = 1$ satisfies

1. $U_1 \cap U_2 = A$;

2. $y_2 \notin U_1$;

3. $y_1 \ne y_2$;

4. $y_1 \not\equiv_A y_2$;

5. For $u_1, u_2 \in U_1$ with $u_1 \ne u_2$, $u_1 \not\equiv_A u_2$;

6. For $u_1 \in U_1$, $u_2 \in U_2$, $u_1 \not\equiv_A u_2$.

Proof. From (22) the expected number of pairs $(U_1,y_1) \ne (U_2,y_2)$ with $I(U_1,y_1) = I(U_2,y_2) = 1$
and $|U_1 \cap U_2| > \frac{9k}{10}$ is $o(1)$. The total number of pairs $(U_1,y_1) \ne (U_2,y_2)$ with $(U_1 \cap U_2) \setminus A \ne \emptyset$
is less than $M^2 \frac{k^2}{n}$. For each with $|U_1 \cap U_2| \le \frac{9k}{10}$ a weak form of (17) gives that
$E[I(U_1,y_1)\, I(U_2,y_2)] \le 2p^2$. Hence the expected number of such pairs with $I(U_1,y_1) = I(U_2,y_2) = 1$
is bounded from above by $M^2 \frac{k^2}{n} (2p^2) = O((\ln^4 n)/n) = o(1)$. Hence the probability
that (1.) fails is $o(1)$.

For (2.) we first employ (1.) and restrict attention to $U_1 \cap U_2 = A$. The number of $(U_1,y_1), (U_2,y_2)$
with $y_2 \in U_1$ is less than $M^2 \frac{k}{n}$ and for each $E[I(U_1,y_1)\, I(U_2,y_2)] \approx p^2$ so the expected number
with $I(U_1,y_1)\, I(U_2,y_2) = 1$ is less than $\approx (Mp)^2 \frac{k}{n}$ which is $O((\ln^3 n)/n) = o(1)$.

For (3.) we first employ (1.) and restrict attention to $U_1 \cap U_2 = A$. The number of such
$(U_1,y_1), (U_2,y_2)$ with $y_1 = y_2$ is about $M^2 n^{-1}$ and for each such $\Pr[I(U_1,y_1) = I(U_2,y_2) = 1] \sim p^2$
so the expected number of violations of (3.) is around $(Mp)^2 n^{-1} = \mu^2 n^{-1} = o(1)$.

For (4.) we first employ (1.)-(3.) and restrict attention to pairs satisfying those conditions.
The number of such pairs is $\sim M^2$. For each such $\Pr[I(U_1,y_1) = I(U_2,y_2) = 1] \sim p^2$ and,
conditioned on this, $\Pr[y_1 \equiv_A y_2] \sim 2^{-k/2}$. Hence the expected number of pairs violating (4.)
is $\approx (Mp)^2 2^{-k/2} = \mu^2 2^{-k/2}$ which is again $o(1)$.

For (5.) there are $M$ choices of $U_1, y_1$ and $O(k^2)$ choices for $u_1, u_2$. For each $\Pr[I(U_1,y_1) = 1] = p$,
$\Pr[u_1 \equiv_A u_2] = 2^{-k/2}$ and these events are independent so the expected number of violations
of (5.) is $O(Mpk^2 2^{-k/2})$ which is $o(1)$.

For (6.) we restrict attention to those cases satisfying (1.)-(4.). There are fewer than $M^2$ choices
of $U_1, y_1, U_2, y_2$ and $O(k^2)$ choices for $u_1, u_2$. For each $\Pr[I(U_1,y_1) = I(U_2,y_2) = 1] \approx p^2$,
$\Pr[u_1 \equiv_A u_2] = 2^{-k/2}$ and these events are independent so the expected number of violations
of (6.) is $O(M^2 p^2 k^2 2^{-k/2})$ which is $o(1)$. □

Lemma 17 Let $G \in \mathcal{G}(n, \frac12)$ and $A$ be a fixed subset of $\frac{k}{2}$ vertices, as above. Let $Z$ denote the
union of all sets $U - A$ where some $I(U,y) = 1$ and let $S = S_{k/2}(A)$ denote the set of all such
vertices $y$. Let $R = V \setminus (A \cup Z \cup S)$. Then whp

1. All $y_1, y_2 \in S$ have $y_1 \not\equiv_A y_2$.

2. All $u_1, u_2 \in Z$ have $u_1 \not\equiv_A u_2$.

3. There are no distinct $z_1, z_2, z_3, z_4 \in R$ with $z_1 \equiv_S z_2$ and $z_3 \equiv_S z_4$.

4. There are no distinct $z_1, z_2, z_3 \in R$ with $z_1 \equiv_S z_2 \equiv_S z_3$.

Proof. The first two statements are the conclusions (4.)-(6.) of Lemma 16. We concentrate on
showing (3.) as (4.) is similar. Set

$$l = \mu - \mu^{0.6}. \qquad (25)$$

Let $Y$ denote the number of $l$-sets $\{(U_i, y_i) : 1 \le i \le l\}$ (counting permutations of the $(U_i, y_i)$
as the same) and $z_1, z_2, z_3, z_4$ satisfying

• $I(U_i, y_i) = 1$ for $1 \le i \le l$.

• The $U_i - A$ are disjoint, the $y_i$ are distinct, and no $y_i \in U_j$.

• $z_1, z_2, z_3, z_4 \in R$ where $R$ denotes all vertices except the $U_i$ and the $y_i$.

• $z_1 \equiv_S z_2$ and $z_3 \equiv_S z_4$ where we set $S = \{y_1, \ldots, y_l\}$.

We bound $E[Y]$. There are less than $M^l / l!$ choices for the $(U_i, y_i)$ and $n^4$ choices for the
$z_1, z_2, z_3, z_4$. Fix those choices. Set $R^- = R \setminus \{z_1, z_2, z_3, z_4\}$ and let $z \in R^-$. For each $1 \le i \le l$

$$\Pr[z \equiv_{U_i} y_i] = 2^{-k}.$$

For each $1 \le i < j \le l$, as $|U_i \cap U_j| = \frac{k}{2}$,

$$\Pr[(z \equiv_{U_i} y_i) \wedge (z \equiv_{U_j} y_j)] \le 2^{-\frac{3k}{2}}.$$

We apply the Bonferroni inequality, in the form that the probability of a disjunction is at least
the sum of the probabilities minus the sum of the pairwise probabilities:

$$\Pr\Big[\bigvee_{i=1}^{l} (z \equiv_{U_i} y_i)\Big] \ge l\, 2^{-k} - \binom{l}{2} 2^{-\frac{3k}{2}}.$$

These events are independent over the $z \in R^-$ as they involve different adjacencies. Let OK
denote the event that no $z \equiv_{U_i} y_i$ for any $z \in R^-$ and $1 \le i \le l$. The independence gives:

$$\Pr[\mathrm{OK}] \le \Big(1 - l\, 2^{-k} + \binom{l}{2} 2^{-\frac{3k}{2}}\Big)^{|R^-|}.$$

We bound $|R^-| = n - O(kl)$ and

$$1 - l\, 2^{-k} + \binom{l}{2} 2^{-\frac{3k}{2}} \le (1 - 2^{-k})^l\, (1 + n^{-1.1})$$

so that

$$\Pr[\mathrm{OK}] \le p^l\, (1 + n^{-1.1})^n\, (1 + o(1)) \le p^l\, (1 + o(1)).$$

Our saving comes from

$$\Pr[z_1 \equiv_S z_2] = \Pr[z_3 \equiv_S z_4] = 2^{-l}.$$
The adjacencies of the $z_i$ to $S$ are independent of the event OK. But

$$\bigwedge_{i=1}^{l} \big(I(U_i, y_i) = 1\big) \Rightarrow \mathrm{OK}.$$

Thus

$$\Pr\Big[\bigwedge_{i=1}^{l} \big(I(U_i, y_i) = 1\big) \wedge (z_1 \equiv_S z_2) \wedge (z_3 \equiv_S z_4)\Big] \le p^l\, 2^{-2l}\, (1 + o(1)).$$

Putting this together

$$E[Y] \le \frac{M^l}{l!}\, n^4 p^l\, 2^{-2l}\, (1 + o(1)).$$

Recall that $Mp = \mu \approx 10 \log_2 n$. The function $\mu^x / x!$ hits a maximum at $x = \mu$ where it is less
than $e^{\mu}$. Thus

$$\frac{\mu^l}{l!} < e^{\mu}.$$

Hence

$$E[Y] \le n^4 e^{\mu}\, 2^{-2l}\, (1 + o(1)).$$

We have selected $l \approx \mu$ so that

$$e^{\mu}\, 2^{-2l} = (e/4)^{\mu(1+o(1))} = n^{-K(1+o(1))},$$

where $K = -10 \log_2(e/4) > 4$. We deduce

$$E[Y] = o(1)$$

so that almost surely there is no such $l$-tuple. Recall that $X$ was the total number of $(U,y)$
with $I(U,y) = 1$. As $E[X] = \mu$ and, from (20), $\mathrm{Var}[X] = O(\mu)$, with probability $1 - o(1)$ we
have $X \ge l$. Further, Lemmas 14 and 16 give that whp the extensions have properties (1.)-(2.).
Thus whp there exists a family of $(U_i, y_i)$ of size $l$ which satisfies (1.)-(2.). But also whp any
such family of size $l$ will satisfy (3.) and (4.). So whp there is such a family. The expansion
of the family to all $(U,y)$ with $I(U,y) = 1$ retains the properties (3.)-(4.) as the set $S$ is just
getting larger. So the theorem is proved. □



4.3 Putting All Together 

We can now finish the proof of Theorem 12. By Lemma 17 we have whp that all $A \cup S_{k/2}(A)$-similarity
classes are singletons except possibly one 2-element class $\{x, y\}$. If we let $W = A \cup \{x\}$
and $u = \frac{k}{2}$, then clearly $G$ satisfies all the assumptions of Lemma 13, which implies that
$D_2(G) \le k + 5$, giving the required by (13).

Finally, let us justify the Remark after Theorem 12. Recall that given $k$ we have chosen $n$
so that $f(n,k) \sim 10 \log_2 n$ and deduced that $D_2(G) \le k + 5$ whp. The probability that the
$(k-1)$-extension property fails for $G$ is at most

^ n ^(1 - 2~ k+1 ) n ~ k = e fcltt«-2- fc+1 n-(l+o(l))fclnfc = y( n ^ e -(i+o(l))fclnfc = Q ^ 

By Lemma 4, $k + 1 \le D(G)$. Thus, $D(G)$ and $D_2(G)$ are concentrated on at most 5 different
values.



5 Sparse Random Graphs 

The following lemma helps us to deal with very sparse random graphs. Let $t_k = t_k(G)$ be the
number of components of $G$ which are order-$k$ trees. (Thus $t_1(G)$ is the number of isolated
vertices.) For a graph $F$, let $c_F(G)$ be the number of components isomorphic to $F$.

Lemma 18 Suppose that for any connected component $F$ of a graph $G$ we have

$$c_F(G) + v(F) \le t_1(G) + 1. \qquad (26)$$

Then $D(G) = D_1(G) = t_1(G) + 2$ unless $G$ is an empty graph (when $D(G) = D_0(G) = v(G) + 1$).

Proof. Assume that $e(G) \ne 0$.

The lower bound on $D(G)$ follows by considering $G'$ which is obtained from $G$ by adding
an isolated vertex. The graphs $G$ and $G'$ are isomorphic as far as non-isolated vertices are
concerned. The best strategy for Spoiler is to pick $t_1(G) + 1$ isolated vertices in $G'$ and, by
making one more move in $G$, to show that at least one of Duplicator's responses is not an
isolated vertex.

On the other hand, let $G' \not\cong G$. There must be a connected graph $F$ such that $c_F(G) \ne c_F(G')$,
say $c_F(G) < c_F(G')$. Spoiler picks one vertex from some $c_F(G) + 1$ $F$-components of $G'$. If a
move $x$ of Duplicator falls into the same component of $G$ as some previous move $y$ of hers, then
Spoiler switches to $G$ and begins claiming a contiguous path from $x$ to $y$; he wins in at most
$v(F)$ moves by either connecting $x$ to $y$ or by claiming a path of length $v(F) + 1$.

Otherwise, Duplicator must have selected a vertex inside a component C of G which is not 
isomorphic to F. As soon as this happens, Spoiler wins by growing a connected set inside the 
larger component of the two, in at most v(F) moves. 

The total number of moves does not exceed $(c_F(G) + 1) + v(F) \le t_1(G) + 2$ (while we have only
one alternation), as required. □

Theorem 19 Let $\epsilon > 0$ be fixed. Let $p = p(n) \le (\alpha - \epsilon) n^{-1}$, where $\alpha = 1.1918...$ is the (unique)
positive root of the equation 

Then whp $G \in \mathcal{G}(n,p)$ satisfies the condition (26).
In this range, whp $D(G) = D_1(G) = (e^{-pn} + o(1))\, n$.

Proof. It is easy to compute the expectation of $t_k(G)$ for $G \in \mathcal{G}(n,p)$:

$$\lambda_k = E[t_k] = \binom{n}{k} k^{k-2} p^{k-1} q^{k(n-k) + \binom{k}{2} - k + 1}.$$

Let $c = pn$. For a fixed $k$ we have $\lambda_k = (f_k + o(1))\, n$, where $f_k := \frac{c^{k-1} k^{k-2}}{k!\, e^{ck}}$. We have

$$\frac{f_{k+1}}{f_k} = c e^{-c} \times (1 + 1/k)^{k-2}.$$

The first factor $c e^{-c}$ is at most $1/e$ (maximized for $c = 1$). Unexciting algebraic calculations
show that the second factor is monotone increasing for $k \ge 1$ and approaches $e$ in the limit.
This implies that the sequence $f_k$ is decreasing in $k$. (In particular, $f_1$ is strictly bigger than
any other $f_i$, $i \ge 2$.)

Also, $f_1 = e^{-c} > 0.3$ for $0 < c < \alpha$.
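The monotonicity claim is easy to spot-check numerically. The sketch below is only an illustration (the helper name `f` and the sample values of $c$ are assumptions); it evaluates $f_k = c^{k-1} k^{k-2} / (k!\, e^{ck})$ directly and compares consecutive ratios with $c e^{-c} (1 + 1/k)^{k-2}$.

```python
import math

def f(k: int, c: float) -> float:
    """Limit density f_k = c^(k-1) * k^(k-2) / (k! * e^(c*k)) of order-k tree components."""
    return c ** (k - 1) * k ** (k - 2) / (math.factorial(k) * math.exp(c * k))

if __name__ == "__main__":
    for c in (0.5, 1.0, 1.19):
        values = [f(k, c) for k in range(1, 8)]
        ratios = [values[i + 1] / values[i] for i in range(len(values) - 1)]
        print(c, [round(v, 5) for v in values], [round(r, 4) for r in ratios])
```

Since $c e^{-c} \le 1/e$ and $(1 + 1/k)^{k-2} < e$, every ratio stays below 1, which is what the printed ratios should reflect.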

Theorem 5.7 in Bollobás [4] describes the structure of a typical $G$ for $p = O(n^{-1})$. In particular,
it implies that there is a constant $K$ such that whp at least $0.9n$ vertices of $G$ belong either
to tree components of orders at most $K$ or to the giant component. The giant component (for
$c > 1$) has order $(1 - \frac{s}{c} + o(1))\, n$, where $s$ is the only solution of $s e^{-s} = c e^{-c}$ in the range
$0 < s < 1$. It is routine to see that $f_1 > 1 - \frac{s}{c}$. (In fact, $c = \alpha$ is the root of $f_1 = 1 - \frac{s}{c}$; this is
where $\alpha$ comes from.)
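The constant $\alpha$ can be recovered numerically from this description. The following sketch is an illustration under stated assumptions rather than the authors' computation: it solves $s e^{-s} = c e^{-c}$ for the root $s \in (0,1)$ by bisection and then bisects on $c$ for the point where $e^{-c} = 1 - s/c$. The names `small_root`, `gap`, `find_alpha` and the search interval $[1.0001, 2]$ are hypothetical choices.

```python
import math

def small_root(c: float) -> float:
    """The solution s in (0, 1) of s*exp(-s) = c*exp(-c), for c > 1."""
    target = c * math.exp(-c)
    lo, hi = 0.0, 1.0            # s*exp(-s) increases from 0 to 1/e on [0, 1]
    for _ in range(100):
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def gap(c: float) -> float:
    """f_1 minus the giant-component density: exp(-c) - (1 - s/c)."""
    return math.exp(-c) - (1 - small_root(c) / c)

def find_alpha() -> float:
    """Bisection for the c > 1 at which isolated vertices and the giant balance."""
    lo, hi = 1.0001, 2.0         # gap(lo) > 0 > gap(hi)
    for _ in range(100):
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    print(find_alpha())
```

The value found this way agrees with the stated $\alpha = 1.1918...$ to the digits given in the text.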

A theorem of Barbour [3] (Theorem 5.6 in [4]) implies that, for any $k \le K$, we have whp

$$|t_k(G) - \lambda_k| \le o(n). \qquad (27)$$

Now, we have all the ingredients we need to check (26). Let $F \subset G$ be any connected
component. If $F$ is the giant component of $G$, then $c_F(G) = 1$ but, as we have seen, $v(F) < t_1(G)$
so (26) holds. So we can assume that $v(F) = o(n)$. If $v(F) > K$, then $c_F(G) + v(F) < t_1(G)$,
because at most $0.1n$ vertices lie in such components while $t_1(G) = (f_1 + o(1))\, n > 0.3n$.
If $F$ is a tree with $k \in [2, K]$ vertices, then (27) and the inequality $f_1 > f_k$ imply the
required. Finally, it remains to assume that the component $F$ of order at most $K$ contains
a cycle. But the expected number of such components of each order $k \le K$ is at most
$\binom{n}{k} 2^{\binom{k}{2}} p^{k} = O(1)$.

Markov's inequality implies that whp no such $F$ violates (26). □
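Theorem 19 says that in this range $D(G)$ is governed by the number of isolated vertices, roughly $e^{-pn} n$. A quick simulation makes the leading term visible. This is a minimal sketch with assumed parameters (the function name `isolated_fraction`, the seed and the sample size are illustrative and not taken from the paper); it samples $\mathcal{G}(n, c/n)$ naively and reports the fraction of isolated vertices.

```python
import math
import random

def isolated_fraction(n: int, c: float, seed: int = 0) -> float:
    """Sample G(n, c/n) edge by edge and return t_1(G) / n."""
    rng = random.Random(seed)
    p = c / n
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:     # include the edge {i, j} with probability p
                degree[i] += 1
                degree[j] += 1
    return sum(1 for d in degree if d == 0) / n

if __name__ == "__main__":
    n, c = 1500, 1.0
    print("simulated t_1/n:", isolated_fraction(n, c), " e^-c:", math.exp(-c))
```

The quadratic loop is chosen for transparency; for much larger $n$ one would sample edges more economically.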

Of course, the value of $t_1(G)$ can be estimated more precisely for some $p$ than we did in
Theorem 19. Without going into much detail, let us describe some of the cases here. Let $\omega$ be
any function of $n$ which (arbitrarily slowly) tends to infinity with $n$.

If $n^2 p \to 0$, then whp we have isolated vertices and edges only. The distribution of $t_2(G) = e(G)$
approaches the Poisson distribution $P_{\lambda_2}$. Hence, we have whp that $n - \omega < D(G) \le n + 1$.

Suppose that $n^2 p \to \infty$ but $pn \to 0$. The expected number of vertices in components of order at
least 3 is at most $n \binom{n}{2} 3p^2 = o(\lambda_2)$. By Markov's inequality, whp we have $o(\lambda_2)$ such vertices.
On the other hand, the distribution of $t_2(G)$ is $o(1)$-close to $P_{\lambda_2}$ (Theorem 5.1 in [4]). Hence,

D(G)-n+^| <o{\ 2 )+lo. 

Observe that there is no phase transition in the behavior of $D(G)$ at $p \sim \frac{1}{n}$. This should not
be surprising: $D(G)$ is determined by $t_1(G)$ in this range. We believe (but were not able to
prove) that whp $D(G) = t_1(G) + 2$ for $p = O(n^{-1})$. To show this it is enough to define the
giant component by a sentence of depth $o(n)$. In fact, we conjecture that a far stronger claim
is true.



Conjecture 20 Let $\frac{1+\epsilon}{n} \le p = O(n^{-1})$. Then whp $D(F) = O(\ln n)$, where $F$ is the giant
component of $G \in \mathcal{G}(n,p)$.



6 Modeling Arithmetics on Graphs 

In this section we consider $D(G)$ for the random graph $G \in \mathcal{G}(n,p)$ where $p = n^{-1/4}$. We
expect that our results would hold for $p = n^{-\alpha}$ for any rational $\alpha \in (0,1)$, but this would
require considerable technical work so we are content with this one case. In [20, Section 8] 
it was shown, for a = |, that there was an arithmetization of certain sets that led to non- 
convergence and non-separability results. Our methods here will be similar. 

Theorem 21 Let $p = n^{-1/4}$ and $G \in \mathcal{G}(n,p)$. Then whp

$$D(G) = \Theta(\log^* n).$$
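To get a feeling for how slowly $\log^* n$ grows, here is a small sketch of the tower function and its inverse. The exact convention is an assumption made for illustration: $\mathrm{TOWER}(0) = 1$, $\mathrm{TOWER}(k) = 2^{\mathrm{TOWER}(k-1)}$, and $\log^* n$ is the least $k$ with $\mathrm{TOWER}(k) \ge n$; other conventions differ by an additive constant, which is immaterial for a $\Theta(\log^* n)$ statement.

```python
def tower(k: int) -> int:
    """TOWER(0) = 1 and TOWER(k) = 2 ** TOWER(k - 1)."""
    t = 1
    for _ in range(k):
        t = 2 ** t
    return t

def log_star(n: int) -> int:
    """Least k with TOWER(k) >= n, found by iterating n -> ceil(log2 n)."""
    k = 0
    while n > 1:
        n = (n - 1).bit_length()   # equals ceil(log2(n)) for every integer n >= 2
        k += 1
    return k

if __name__ == "__main__":
    for n in (2, 16, 17, 65536, 65537, 10 ** 80):
        print(n, log_star(n))
```

Under this convention any $n$ between 65537 and $2^{65536}$ has $\log^* n = 5$, so for every graph of realistic size the bound of Theorem 21 is a small constant.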

The lower bound is very general. We use only the trivial fact that any particular graph is the
value of the random graph with probability $O(n^{-1})$. (Indeed, the value is exponentially small.)
Let $F(k)$ be the number of first order sentences of depth at most $k$. Then $\Pr[D(G) \le k] =
O(F(k)\, n^{-1})$ as there are at most $F(k)$ such graphs. From general principles [20, Section 2.2]
$F(k)$ is bounded effectively by the tower function so that if $k = c \log^* n$ with $c$ appropriately
small $F(k) = o(n)$ and $\Pr[D(G) \le k] = o(1)$.

Now we turn to the main part, bounding $D(G)$ from above. For any set $W$ of vertices let $N(W)$
denote the set of common neighbors of $W$. When $|W| = 4$, $\Pr[N(W) = \emptyset] = (1 - p^4)^{n-4} \approx e^{-1}$.
We are guided by the idea that $N(W) = \emptyset$ is like a random symmetric 4-ary predicate with
probability $e^{-1}$, which is bounded away from both zero and one.

Let W be a set of four vertices. Dependent only on W we define 

• A = N(W), the common neighbors of W; 

• $B$, consisting of those $z \notin W \cup A$ such that $z$ is adjacent to precisely four vertices of $A$
and no other $z' \notin W \cup A$ has exactly the same adjacencies to $A$.

For $w \notin W \cup A$ let $H_w(A)$ denote the 3-regular hypergraph on $A$ consisting of those triples
$T$ so that there is no $z \notin W \cup A$ adjacent to $T \cup \{w\}$. (The condition that $z \notin W \cup A$ is a
technical convenience that does not asymptotically affect the $H_w(A)$.) If, further, $a \in A$ we let
$H_{w,a}(A)$ denote the 2-regular hypergraph (i.e. graph) of pairs $T$ with $T \cup \{a\} \in H_w(A)$. Further,
for distinct $a, b \in A$ we let $H_{w,a,b}(A)$ denote the 1-regular hypergraph (i.e. set) of elements $y$
with $\{a, b, y\} \in H_w(A)$. For $w \notin W \cup A \cup B$ let $H_w(B)$ denote the 3-regular hypergraph on $B$
consisting of those triples $T$ so that there is no $z \notin W \cup A \cup B$ adjacent to $T \cup \{w\}$. (Again,
the condition $z \notin W \cup A \cup B$ is a technical convenience.) Informally, the idea is that the
$H_w(A), H_w(B)$ act like random objects with probability $e^{-1}$.

Call $A$ universal if the $H_v(A)$, $v \notin W \cup A$, range over all 3-regular hypergraphs on $A$. Call
$B$ splitting if the $H_v(B)$, $v \notin W \cup A \cup B$, are all different. As there are $2^{\Theta(m^3)}$ 3-regular
hypergraphs over an $m$-set a simple counting argument gives that if $A$ is universal we must
have $|A| = O(\ln^{1/3} n)$ while if $B$ is splitting we must have $|B| = \Omega(\ln^{1/3} n)$.

Our argument splits into two lemmas. 

Lemma 22 Whp there exists a 4-set W such that, with A, B as defined above, 

1. A is universal; 

2. B is splitting. 

Lemma 23 Any graph $G$ on $n$ vertices with the property of Lemma 22 has $D(G) = O(\log^* n)$.

Note that the proof of Lemma 22 is a random graph argument while the proof of Lemma 23 is 
a logic argument involving no probability. 

Proof of Lemma 22. Set $u = \lfloor \ln^{0.3} n \rfloor$. For any set $W$ of four vertices

$$\Pr[\,|N(W)| = u\,] = \Pr[\mathrm{Bin}(n - 4, p^4) = u] \approx e^{-1}/u!.$$

Thus the expected number $\mu$ of such $W$ has $\mu = \Theta(n^4/u!)$ which approaches infinity. An
elementary second moment calculation gives that the number of such $W$ is $(1 + o(1))\, \mu$ whp.
Hence it suffices to show that the expected number of $W$ with $A$ having size $u$ but $A, B$ failing
the conditions of Lemma 22 is $o(\mu)$. Fix $W$ and $A$ of size $u$. It suffices to show that $A, B$ satisfy
Lemma 22 whp. The conditioning is only on the adjacencies involving a vertex of $W$, all other
adjacencies remain random.
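The Poisson approximation at the start of this proof, $\Pr[\mathrm{Bin}(n-4, p^4) = u] \approx e^{-1}/u!$ with $p^4 = 1/n$, can be checked directly. The sketch below is illustrative only; the helper names `binom_pmf` and `poisson1_pmf` and the sample size $n = 10^6$ are assumptions.

```python
import math

def binom_pmf(N: int, q: float, u: int) -> float:
    """Exact Bin(N, q) probability of the value u."""
    return math.comb(N, u) * q ** u * (1 - q) ** (N - u)

def poisson1_pmf(u: int) -> float:
    """Poisson(1) probability of the value u, that is e^{-1} / u!."""
    return math.exp(-1) / math.factorial(u)

if __name__ == "__main__":
    n = 10 ** 6
    q = 1 / n                      # p^4 for p = n^(-1/4)
    for u in range(6):
        print(u, binom_pmf(n - 4, q, u), poisson1_pmf(u))
```

For fixed small $u$ the two columns agree up to a relative error of order $u^2/n$, which is the sense of the approximation used in the proof.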

First we show that $A$ is universal. Let $Z$ be those vertices adjacent to four or more vertices of
$A$. Let $Z'$ consist of vertices with at least one neighbor in $Z$. Whp every four vertices in the
graph have $O(\ln n)$ common neighbors so $|Z|$ is polylog while $|Z'| = n^{3/4 + o(1)}$.

For each 3-set $Y \subset A$ let $N^-(Y)$ denote those $v \notin W \cup A \cup Z$ which are adjacent to all vertices of
$Y$. The $|N^-(Y)|$ are independent, each with Binomial distribution $\mathrm{Bin}(n - o(n), (1 + o(1))\, n^{-3/4})$.
The probability that $|N^-(Y)| > 2n^{1/4}$ or $|N^-(Y)| < \frac12 n^{1/4}$ is then less than $\exp(-c n^{1/4})$ for a
constant $c$ by Chernoff bounds. We only need that this probability is $o(n^{-3})$. Thus with high
probability for every 3-set $Y \subset A$ we have $|N^-(Y)| \in [\frac12 n^{1/4}, 2n^{1/4}]$. Condition on these $N^-(Y)$
satisfying these conditions. Let $R$ be the (remaining) vertices, not in $W$, $A$, $Z$, $Z'$ nor any of
the $N^-(Y)$. For $z \in R$ the adjacencies to the $N^-(Y)$ are still random. For such $z$ we have
$Y \in H_z(A)$ if and only if $z$ is adjacent to no vertex in $N^-(Y)$. (Note that $z$ does not send any
edges to $Z$.) As $|N^-(Y)| \le 2n^{1/4}$ the probability that $z$ is adjacent to no vertex of $N^-(Y)$ is
at least $e^{-2}$. As $|N^-(Y)| \ge \frac12 n^{1/4}$ the probability that $z$ is adjacent to some vertex of $N^-(Y)$
is at least $1 - e^{-1/2}$. Set $\gamma = \min(e^{-2}, 1 - e^{-1/2})$. Then for any hypergraph $H$ on $A$ we have
$\Pr[H_z(A) = H] \ge \gamma^{\binom{u}{3}}$. But these events are now independent over the $(1 - o(1))\, n$ values $z \in R$
so that the probability that no $H_z(A) = H$ is less than $(1 - \gamma^{\binom{u}{3}})^{(1 - o(1)) n}$. Here because $u^3 = o(\ln n)$
this quantity is less than, say, $\exp(-n^{0.99})$. There are fewer than $2^{u^3}$ hypergraphs $H$ on $A$.
Hence the probability that any such $H$ is not one of the $H_z(A)$ is less than $2^{u^3} \exp(-n^{0.99})$.
The $2^{u^3}$ term is basically negligible and the probability that $A$ is not universal is less than
$\exp(-n^{0.98})$ and certainly $o(1)$. We note that $A$ being universal will not be fully needed in
Lemma 23; we shall need only seven particular values of $H_z(A)$.

Now we look at the size of $B$. For each $z \notin A \cup W$ the probability that $z$ is adjacent to precisely
four elements of $A$ is $\binom{u}{4} p^4 (1 - p)^{u-4} \sim u^4 n^{-1}/24$ and given this the probability that no other $z'$
has the same adjacencies is approximately $(1 - p^4)^n \approx e^{-1}$ so $B$ has expected size $\mu \approx u^4/24e$.
A second moment calculation gives that whp $|B| \approx \mu = \Theta(\ln^{1.2} n)$.

Finally we show that $B$ is splitting. At this stage $W, A, B$ are fixed and all of the adjacencies
that do not have at least one vertex from $W \cup A$ are random. Whp no $z \notin W \cup A \cup B$ is adjacent
to five (or more) vertices of $B$. Let $Z$ be those $z \notin W \cup A \cup B$ adjacent to four vertices of $B$.
Whp $|Z|$ is polylog.

For each 3-set $Y \subset B$ we let $N^*(Y)$ denote those $v \notin W \cup A \cup B$ which are adjacent to all
vertices of $Y$ and $N^-(Y) = N^*(Y) - Z$. As before, whp all $|N^*(Y)|$ have size between $\frac12 n^{1/4}$
and $2n^{1/4}$ and so the same, asymptotically, holds for the $|N^-(Y)|$. As before we fix the $N^*(Y)$
and their adjacencies to $Y$. Consider distinct $u, u' \notin W \cup A \cup B$. The probability that either $u$
or $u'$ is adjacent to nine (or more) vertices of $Z$ is $o(n^{-2})$. Call a 3-set $Y \subset B$ exceptional if $u$
or $u'$ is adjacent to some $z \in Z$ which is adjacent to all of $Y$. With probability $1 - o(n^{-2})$ there
are at most $2 \cdot 8 \cdot 4 = 64$ exceptional $Y$. Hence the number of nonexceptional $Y$ is $\approx \binom{|B|}{3}$. For
the nonexceptional $Y$ we have $Y \in H_u(B)$ if and only if $u$ is adjacent to no vertex in $N^-(Y)$
and similarly for $Y \in H_{u'}(B)$. Thus $\Pr[Y \in H_u(B)] \in [\gamma, 1 - \gamma]$ with $\gamma$ as previously defined.
Further these events are independent over different nonexceptional $Y$. Set $\gamma^* = \gamma^2 + (1 - \gamma)^2$.
Then $H_u(B), H_{u'}(B)$ agree on a nonexceptional $Y$ with probability at most $\gamma^*$. Independence
gives that they agree on all nonexceptional $Y$ with probability at most $\gamma^*$ to the $\sim \binom{|B|}{3}$ power.
As this power is $\gg \ln n$ the probability is certainly $o(n^{-2})$. There are $O(n^2)$ choices of $u, u'$ so
whp no $H_u(B) = H_{u'}(B)$. □

Proof of Lemma 23. The main portion of the argument consists of placing an arithmetic
structure on $A$ in such a way that any vertex in $A$ can be described with quantifier depth
$O(\log^* |A|) = O(\log^* n)$. For convenience we assume $|A| = 3s + 2$. (Otherwise we would give
the one or two extra elements of $A$ which would increase the depth by at most two.) Label the
elements of $A$ by $a, b, x_1, \ldots, x_s, y_1, \ldots, y_s, z_1, \ldots, z_s$ in an arbitrary way. Now we, effectively,
model arithmetic on $A$. From the universality there exist $w_1, \ldots, w_7$ (witnesses) such that

1. $H_{w_1}$ consists of the triples $\{x_i, y_i, z_i\}$

2. $H_{w_2,a,b}$ consists of the elements $x_i$

3. $H_{w_3,a,b}$ consists of the elements $y_i$

4. $H_{w_4,a}$ consists of all pairs $\{x_i, y_j\}$ with $i \le j$

5. $H_{w_5}$ consists of all triples $\{x_i, y_j, z_{i+j}\}$ with $1 \le i, j$, $i + j \le s$

6. $H_{w_6}$ consists of all triples $\{x_i, y_j, z_{i \cdot j}\}$ with $1 \le i, j$, $i \cdot j \le s$.

7. $H_{w_7,a}$ consists of all pairs $\{x_i, y_{2^i}\}$ with $1 \le i \le s$ and $2^i \le s$.

We give a first order expression in terms of the four vertices of $W$ (which define $A$), the special
elements $a, b \in A$ and the witnesses $w_1, \ldots, w_7$ which forces $A$ to have this form. Note that
membership in $A$ is given by a first order statement and membership in an $H_w$ or $H_{w,a}$ or
$H_{w,a,b}$ is given by a first order statement in terms of the variables. Let $A^- = A - \{a, b\}$ for
convenience. Now we express the following six properties.

1. (1-Factor) $H_{w_1}$ consists of vertex disjoint triples and every element of $A^-$, and only those
elements, are in such a triple.

2. (Splitting the 1-Factor) For each triple in $H_{w_1}$ exactly one of the elements is in $H_{w_2,a,b}$
and exactly one (a different one) is in $H_{w_3,a,b}$. Now let (for convenience) $X$ denote those
$x \in A^-$ with $x \in H_{w_2,a,b}$ and let $Y$ denote those $y \in A^-$ with $y \in H_{w_3,a,b}$ and $Z$ the
other elements of $A^-$. Henceforth the use of the letter $x, y, z$ shall tacitly assume that
the element is in the respective set $X, Y, Z$. We write $x \sim y$ or $x \sim z$ or $y \sim z$ if the two
elements are in a common triple in $H_{w_1}$.

3. (Creating $\le$) Here adjacency is in $H_{w_4,a}$. We require that all adjacencies be between an
$x$ and a $y$. Let $N(x)$ denote the $y$ adjacent to $x$. We require that for every $x, x'$ either
$N(x) \subset N(x')$ or $N(x') \subset N(x)$ with equality only when $x = x'$. We require that when
$y \sim x$ then $y \in N(x)$. This forces the $N(x)$ to form a chain and so the $x$ and $y$ can be
renumbered to fit the condition. We now define $x \le x'$ by $N(x') \subset N(x)$. The relations
$\ge, >, <$ have their natural Boolean meaning in terms of $\le$. We define $y \le y'$ and $z \le z'$
by $x \le x'$ where $x \sim y \sim z$ and $x' \sim y' \sim z'$. We let $x_1, y_1, z_1$ denote the first elements
under $\le$ and $x_s, y_s, z_s$ the last elements. The notions of successor $x^+$ and predecessor $x^-$
are naturally defined (when they exist) in terms of $\le$. We let $z_2$ denote the successor of
$z_1$.

4. (Creating addition) Addition is generated from the formulas $\alpha + 1 = \alpha^+$ and $\alpha + \beta^+ =
(\alpha + \beta)^+$, though we need some care as addition in this model is not always defined. For
every $x \in X$, $y \in Y$ there is at most one $z \in Z$ with $\{x, y, z\} \in H_{w_5}$. $\{x_1, y_1, z_2\} \in H_{w_5}$.
For $x \ne x_s$, $\{x, y_1, z\} \in H_{w_5}$ where $z \sim x^+$. If $\{x, y, z\} \in H_{w_5}$ and $y, z$ have successors then
$\{x, y^+, z^+\} \in H_{w_5}$. If $\{x, y, z\} \in H_{w_5}$ and $y, z$ have predecessors then $\{x, y^-, z^-\} \in H_{w_5}$.
We let $x + x' = x^*$ denote that $\{x, y', z^*\} \in H_{w_5}$ where $y' \sim x'$ and $z^* \sim x^*$. Let $x + z = z'$
mean that when $z, z'$ are replaced by their $\sim$ elements in $X$ then we have the equality,
and similarly for other forms like $y + y' = z$.

5. (Creating multiplication) Multiplication is generated from the formulas $\alpha \cdot 1 = \alpha$ and
$\alpha \cdot \beta^+ = (\alpha \cdot \beta) + \alpha$, though we need some care as addition in this model is not always
defined. For every $x \in X$, $y \in Y$ there is at most one $z \in Z$ with $\{x, y, z\} \in H_{w_6}$.
$\{x, y_1, z\} \in H_{w_6}$ precisely when $x \sim z$. If $\{x, y, z\} \in H_{w_6}$ and $y$ has a successor then
$\{x, y^+, z'\} \in H_{w_6}$ if and only if $z' = x + z$.

6. (Creating exponentiation) Base two exponentiation is defined by $2^1 = 2$ and $2^{\alpha^+} = 2^{\alpha} + 2^{\alpha}$,
though we need some care as addition in this model is not always defined. For every $x \in X$
there is at most one $y \in Y$ with $\{x, y\} \in H_{w_7,a}$. $\{x_1, y_2\} \in H_{w_7,a}$. If $\{x, y\} \in H_{w_7,a}$ then
$\{x^+, y'\} \in H_{w_7,a}$ if and only if $y + y = y'$. We write $x' = 2^x$ if $\{x, y'\} \in H_{w_7,a}$ and $x' \sim y'$.

We can give the $d$-th binary digit of $x$ (we count the first digit as the first on the right, zero if
and only if $x$ is even) for all $x \in X$. The 1-st digit of $x$ is zero if and only if $x = x' + x'$ for some
$x'$. Otherwise the $d$-th digit is zero if and only if there exist $q, r$ with $x = q \cdot 2^d + r$ and $r < 2^{d-1}$
or if $x = q \cdot 2^d$ or if $x < 2^{d-1}$. (This technical complication is caused by leaving zero out of
the model.) This is first order as we already have multiplication, exponentiation, less than and
addition.
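The zero-free digit characterization can be checked by brute force. The sketch below is an illustration, not part of the proof: `digit_is_zero_model` mimics the three cases in the text with every quantified number ranging over $1, 2, 3, \ldots$ only, and `digit_is_zero_direct` reads the bit off the usual binary expansion. Both function names are hypothetical.

```python
def digit_is_zero_direct(x: int, d: int) -> bool:
    """True if the d-th binary digit of x (counted from the right, starting at 1) is 0."""
    return ((x >> (d - 1)) & 1) == 0

def digit_is_zero_model(x: int, d: int) -> bool:
    """Same predicate, phrased as in the text with all numbers ranging over 1, 2, 3, ..."""
    if d == 1:
        # first digit: x = x' + x' for some positive x'
        return any(x == y + y for y in range(1, x))
    big, half = 2 ** d, 2 ** (d - 1)
    # x = q * 2^d + r with positive q and 1 <= r < 2^(d-1)
    case_qr = any(x == q * big + r
                  for q in range(1, x // big + 1)
                  for r in range(1, half))
    # x = q * 2^d (the remainder would be zero, which the model lacks)
    case_q = any(x == q * big for q in range(1, x // big + 1))
    # x < 2^(d-1) (the quotient would be zero)
    case_small = x < half
    return case_qr or case_q or case_small

if __name__ == "__main__":
    ok = all(digit_is_zero_model(x, d) == digit_is_zero_direct(x, d)
             for x in range(1, 301) for d in range(1, 10))
    print("characterization agrees on the tested range:", ok)
```

The two extra cases $x = q \cdot 2^d$ and $x < 2^{d-1}$ are exactly the situations in which the remainder or the quotient would be zero.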

Now any $x \le s$ is described with quantifier depth $O(\log^* s)$. Let $x$ have $m$ digits. We say that
$x < 2^m$ and the conjunction over $d \le m$ of the statements that the $d$-th binary digit is what it
is. For each such $d$ (and for $m$) we have to describe $d$. But now we are describing numbers up
to $\log_2 s$ and so by an induction the total depth will be $\Theta(\log^* s)$. This also includes describing
the last element $x_s$ so that we determine $s$ with depth $O(\log^* s)$.

Now the elements $v$ of $B$ are described with depth $O(\log^* s)$ by describing the four vertices of
$A$ that $v$ is adjacent to and saying that $v$ is adjacent to no other vertices of $A$ and that no other
$w \notin W \cup A$ has just those adjacencies to $A$.

Finally any $v \notin W \cup A \cup B$ is described in depth $O(\log^* s)$ by listing the edges of $H_v(B)$ and
stating that no other $v'$ produces the same hypergraph. The assumption that $B$ is splitting means that
we have described all the vertices. □



7 Concluding Remarks 

Our Theorem 2 has a strong link with the zero-one law which was discovered independently
by Glebskii et al [10] and Fagin [8] and says that $G \in \mathcal{G}(n, \frac12)$ satisfies any fixed first order
sentence with probability approaching either 0 or 1. Given $\epsilon \in (0, 1)$, define $T_\epsilon(n)$ to be the
maximum $k$ such that, if $n_1, n_2 \ge n$, then $D(G,H) > k$ with probability at least $1 - \epsilon$ for
independent $G \in \mathcal{G}(n_1, p)$ and $H \in \mathcal{G}(n_2, p)$. The Bridge Theorem [20, Theorem 2.5.1] says
that, in a rather general setting, a zero-one law is obeyed iff, for each $\epsilon$, $T_\epsilon(n)$ tends to
infinity as $n$ increases. Spencer and St. John [21] call $T_\epsilon(n)$ the tenacity function and suggest it
as a quantitative measure for observation of a zero-one law. While in [21] the tenacity function
is studied for words, here we are able to find its asymptotics in the case of graphs. Since the
lower bound based on the $k$-extension property goes through for $D(G,H)$ with both $G$ and $H$
random, we have $T_\epsilon(n) = \log_{1/p} n + O(\ln \ln n)$, irrespective of the constant $\epsilon$.

Another interesting first order parameter of a graph $G$ is $I(G)$, the smallest depth of a sentence
distinguishing $G$ from any non-isomorphic graph of the same order as $G$. Of course, $I(G) \le
D(G)$, so all upper bounds we have proved apply to $I(G)$ as well. All our lower bounds also apply
to $I(G)$ with the exception of Theorem 19. Its $I(G)$-analog would say that, for $G \in \mathcal{G}(n, \frac{c}{n})$
with $c < c_0$, where $c_0 = 1.034...$, we have whp

$$I(G) = (1 + o(1))\, t_2(G) = (c e^{-2c}/2 + o(1))\, n, \qquad (28)$$

where $t_2(G)$ denotes the number of isolated edges. The reason is that if $G \not\cong G'$ but $v(G) =
v(G')$, then the multiplicities of at least two non-isomorphic components must differ while the
two most frequent components in $G$ are isolated vertices and edges. (And the order of the giant
component catches up with $t_2(G)$ at $p \sim \frac{c_0}{n}$.) The reader should not have any problem in filling
in the missing details.

We make the following general conjecture. 

Conjecture 24 Let $\epsilon > 0$ be fixed and $n^{-1+\epsilon} \le p \le \frac12$. Then whp $D(G) = O(\ln n)$.

One can also ask about $D_\#$, the analog of $D(G)$ when we add counting to first order logic. Here,
the situation is strikingly different. A result of Babai and Kučera [2] (combined with Immerman
and Lander's [11] logical characterization of the vertex refinement step in [2]) implies that whp
$G \in \mathcal{G}(n, \frac12)$ can be defined by a first order sentence with counting of quantifier depth at most 4.